diff --git a/.agents/commands/specledger.verify-workflow.md b/.agents/commands/specledger.verify-workflow.md new file mode 100644 index 0000000..56eb76f --- /dev/null +++ b/.agents/commands/specledger.verify-workflow.md @@ -0,0 +1,107 @@ +--- +description: EXPERIMENTAL — cross-artifact verification WITHOUT tasks.md. Fans out N INDEPENDENT reviewers (spec → plan/research/data-model/contracts/quickstart) via a deterministic Workflow, then MERGES their findings into one report (independent passes catch complementary issues). Read-only. Run from a FRESH session at DEFAULT effort. +handoffs: + - label: Implement (workflow) + agent: specledger.implement-workflow + prompt: Implement the verified feature via the workflow pipeline +--- + +## User Input + +```text +$ARGUMENTS +``` + +Optional `$ARGUMENTS`: number of independent reviewers (default 2), or extra focus (e.g. "emphasize security", a feature id). + +## Purpose + +Read-only cross-artifact consistency verification for a feature whose **`tasks.md` was intentionally not generated** (e.g. when using `/specledger.implement-workflow`). It validates **`spec.md` against `plan.md`, `research.md`, `data-model.md`, `contracts/*.md`, and `quickstart.md`** — the planning artifacts — and produces a Specification Analysis Report. + +**Why a workflow, not a single pass:** independent reviewers reliably catch *different* things. (Real example: one pass caught a committed-tree-vs-working-tree spec↔plan ambiguity; another caught a stale acceptance scenario the first missed.) So this runs **N independent reviewers in parallel** and **merges** them — keeping complementary findings, deduping overlap. + +> **STRICTLY READ-ONLY.** Reviewers and the merge step report findings and recommend fixes; they do **not** edit artifacts. The only write is the optional, explicit save of the report (last step). +> **Do NOT flag the absence of `tasks.md`** — it is intentional for this flow. 
Skip the task↔requirement and `TestQuickstart_*`-task-mapping checks. + +> **Pause for model and effort.** Subagents inherit the launcher's *model* (this script leaves `model` unset). Before launching, **AskUserQuestion which model** the reviewers should use and offer to pause to change effort because **effort is inherited** from the launching session — so a cheap, lower-effort session keeps the fan-out costs under control. + +## What each reviewer checks (the four focus areas) + +1. **Coverage** — every Functional Requirement (FR-*) and Success Criterion (SC-*) in `spec.md` is covered by the plan + a contract + a `TestQuickstart_*` scenario. Flag any requirement with no downstream coverage. +2. **Reverse traceability** — every contract behavior and every quickstart scenario traces back to a spec requirement (no invented behavior ungrounded in the spec). +3. **Consistency** — no contradictions across artifacts (exit codes, lock schema, fingerprint semantics, origin resolution, etc.); in particular flag any **AMBIGUITY that would let an implementer/model decide a behavior two different ways**. +4. **Decision integrity** — the spec's recorded clarification decisions are applied **consistently everywhere they appear**, with **no leftover stale wording**. + +Constitution (`.specledger/memory/constitution.md`) is in scope: a MUST-principle conflict is automatically CRITICAL. + +## Execution steps + +1. **Locate artifacts**: run `sl spec info --json --paths-only`; read `FEATURE_DIR`. (The reviewers Read the artifacts themselves.) +2. **Discover relevant skills**: enumerate the skills available in the session (the available-skills list surfaced by the harness; or invoke `/find-skills` for a gap). **Focus on design skills** — e.g. cobra, agentic CLI design, Supabase Architecture, REST and data modeling. 
Workflow subagents **do** have the `Skill` tool (verified empirically), so every review agent prompt **can and MUST** instruct the agent to load its relevant design and architecture skills via the `Skill` tool *before* reviewing artifacts. Record the review skills you'll bake into the brief. +3. **AskUserQuestion**: Batch which `model` for the reviewers (note effort is inherited from this session) together with the relevant skills (multiple selections allowed). Pass the `model` (or leave `model` unset to inherit) into the script. +4. **Write the feature-specific reviewer brief to disk** at `FEATURE_DIR/reviews/_reviewer-brief.md` (create `reviews/` if needed). It carries everything reviewer-facing — the **SKILLS line** (the chosen skills to load), the read-only rule, the artifact list, the feature context, the four focus areas, and the constitution note. **Why on disk:** the workflow script is plain JS, and embedding long multi-paragraph prompts as string literals is parse-fragile (a stray `/*` glob, an unescaped backtick/apostrophe, or a mis-counted paren breaks the whole script). Keeping the prose in a file makes the script tiny and robust, and gives a single inspectable/editable source of truth. The report template already lives on disk at `.specledger/templates/review-report-template.md` — reviewers/merge read it rather than re-deriving the format. *(Scaffolding files use a `_` prefix; offer to delete them after the run.)* +5. **Author + launch the Workflow** below (it just hands agents the on-disk paths). +6. When it returns, **present the merged report**. Then **offer to save** it (final step). + +## Skill loading is mandatory (not optional) + +> The reviewer brief MUST begin with a **`SKILLS:` line** naming the skills to invoke via the `Skill` tool and apply *before* reviewing. Design artifacts say *what* to build; the skills carry *how this repo designs it* — relying on the artifacts alone leaves that on the table. 
Workflow subagents have the `Skill` tool; do **not** distill skill content into the brief by hand and do **not** assume an agent will load a skill unprompted. + +## Workflow pipeline (author this script) + +> Keep the script **minimal**: it reads the on-disk brief + report template (step 4) and passes their paths to agents. Do **not** embed long prose, globs (`/*`), or multi-paragraph strings — that is the parse-fragility this disk-based design exists to avoid. Use an explicit `for`-loop to build the thunks (clearer paren-balance than a nested `parallel(Array.from(...))` one-liner). + +``` +export const meta = { + name: 'verify-artifacts', + description: 'Cross-artifact verification (no tasks): N independent reviewers + merge, reading on-disk brief + template', + phases: [{ title: 'Review' }, { title: 'Merge' }], +} + +const FD = args.featureDir +const N = args.reviewers || 2 +const MODEL = args.model // undefined → inherit launcher; or set from the AskUserQuestion answer +const BRIEF = FD + '/reviews/_reviewer-brief.md' +const TEMPLATE = '.specledger/templates/review-report-template.md' + +const FINDINGS = { type: 'object', required: ['findings'], properties: { + findings: { type: 'array', items: { type: 'object', + required: ['category', 'severity', 'location', 'summary', 'recommendation'], + properties: { + category: { type: 'string' }, // Coverage|Traceability|Consistency|DecisionIntegrity|Constitution|Ambiguity + severity: { enum: ['CRITICAL', 'HIGH', 'MEDIUM', 'LOW', 'INFO'] }, + location: { type: 'string' }, // file:line(s) or section refs + summary: { type: 'string' }, + recommendation: { type: 'string' } } } }, + coverageGaps: { type: 'array', items: { type: 'string' } }, // FR-*/SC-* with no downstream coverage + staleWording: { type: 'array', items: { type: 'string' } } } } + +// Phase 1 — N INDEPENDENT reviewers (parallel, fresh context each), schema'd findings. 
+phase('Review') +const thunks = [] +for (let i = 0; i < N; i++) { + const pass = i + 1 + thunks.push(() => agent( + 'Read ' + BRIEF + ' FIRST and follow it exactly (load the SKILLS it names via the Skill tool before reviewing, obey the read-only rule, read every artifact it lists). You are INDEPENDENT reviewer pass ' + pass + ' of ' + N + '. Perform the four-focus-area cross-verification described in the brief and return findings per the StructuredOutput schema, citing file:line. Edit nothing.', + { schema: FINDINGS, model: MODEL, phase: 'Review', label: 'reviewer#' + pass })) +} +const passes = (await parallel(thunks)).filter(Boolean) + +// Phase 2 — Merge: keep complementary findings, dedup overlap, reconcile severity, fill the on-disk template. +phase('Merge') +const report = await agent( + 'Read the report template at ' + TEMPLATE + ' and emit a single filled copy as your output, following its structure EXACTLY. Merge ' + passes.length + ' independent review passes: KEEP complementary findings, DEDUP true overlaps (same location+claim), reconcile each severity to the highest justified. Fill the coverage table (one row per FR/SC), the decision-integrity checklist, the metrics, and next actions. STRICTLY READ-ONLY. The passes as schema JSON: ' + JSON.stringify(passes), + { model: MODEL, phase: 'Merge', label: 'merge' }) + +return { report, reviewers: passes.length } +``` + +Pass `args: { featureDir: "", reviewers: , model: "" }`. + +## Report format + +The merge agent fills the on-disk template at **`.specledger/templates/review-report-template.md`** (findings table → coverage summary → decision integrity → metrics → next actions). Present that filled report to the user. + +## Final step — offer to save (explicit, opt-in write) + +After presenting the report, **AskUserQuestion**: save to `FEATURE_DIR/reviews/-review.md`? 
If yes, write the report with YAML frontmatter (`date`, `total_requirements`, `total_tasks: 0`, `coverage_pct`, `critical_issues`), creating `reviews/` if needed, and confirm the path. If a review already exists, offer to **merge into it** (mark resolved/open) rather than overwrite blindly. diff --git a/.agents/skills/qodo-manage-rules/README.md b/.agents/skills/qodo-manage-rules/README.md new file mode 100644 index 0000000..d7934f4 --- /dev/null +++ b/.agents/skills/qodo-manage-rules/README.md @@ -0,0 +1,75 @@ +# qodo-manage-rules + +The single skill for working with your org's **Qodo coding rules** — both *consuming* them +(load the rules relevant to a coding task and apply them while you code) and *administering* +them (modify / scope / deactivate a rule when it's wrong, over-broad, or stale). + +See [`SKILL.md`](./SKILL.md) for the full workflow. This README covers **provenance** and +**setup** only. + +## Why this exists (and what it replaces) + +It replaces the upstream **`qodo-get-rules`** skill, which was vendored from +[`github.com/qodo-ai/qodo-skills`](https://github.com/qodo-ai/qodo-skills) — an +**abandoned** repository. That skill was both incomplete (read-only; no way to manage +rules) and **broken in practice**: it read the API token from `~/.qodo/config.json`'s +`API_KEY` field, which the current Qodo CLI does not write there. Rather than fork a dead +upstream, this skill is self-contained: + +- correct auth (reads `~/.qodo/auth.key`, the file the Qodo CLI actually writes), +- adds the rule-management API (list / search / get / modify / deactivate), reverse-engineered + from the web portal and verified live (see [`references/api-contract.md`](./references/api-contract.md)), +- folds the "load rules for a coding task" job back in, done right (see + [`references/loading-rules.md`](./references/loading-rules.md)). + +**Use this skill instead of `qodo-get-rules`.** The old one has been removed from this repo's +`skills-lock.json`. 
+ +## Setup — getting the API token + +The skill authenticates with a raw `sk-...` bearer token at **`~/.qodo/auth.key`** +(or the `QODO_API_KEY` env var, which takes precedence). + +The fastest way to mint that file is the **Qodo Command** CLI +([docs.qodo.ai/qodo-command](https://docs.qodo.ai/qodo-command)): + +```sh +qodo login # opens a browser, authenticates, and writes ~/.qodo/auth.key +``` + +> ⚠️ **Qodo Command is itself sunset.** Its *chat* is dead — any chat message just returns a +> "this tool is sunset" notice. But `qodo login` still works and is the easiest way to +> create the API key. After login you don't need the CLI again; this skill talks to the +> rules API directly. (If you already have a token from the Qodo web portal, you can skip +> the CLI entirely and write it to `~/.qodo/auth.key` yourself, or export `QODO_API_KEY`.) + +Verify it's in place (the token is 90 bytes, starts `sk-`): + +```sh +ls -lah ~/.qodo/auth.key +``` + +## Configuration + +| What | Source (highest precedence first) | Default | +|------|-----------------------------------|---------| +| **Token** | `$QODO_API_KEY` → `~/.qodo/auth.key` | — (required) | +| **API base** | `$QODO_API_URL` (or `QODO_API_URL` in `config.json`) → `$QODO_ENVIRONMENT_NAME` | `https://qodo-platform.qodo.ai/rules/v1` | + +`~/.qodo/config.json` holds only Qodo CLI **UI preferences** (theme, etc.) — the token is +**not** there. Don't commit `~/.qodo/auth.key` or any HAR capture; this repo's `.gitignore` +already excludes `*.key` and `*.har`. 
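The precedence in the Configuration table can be sketched in Python. This is a minimal illustration under stated assumptions, not the code in `scripts/qodo_rules.py`: the function names `resolve_token` and `resolve_base_url` are hypothetical, checking the environment variable before `config.json` for `QODO_API_URL` is an assumption about the "or", and the `/rules/v1` suffix follows the base-URL rules in `references/api-contract.md`.

```python
import json
from pathlib import Path

DEFAULT_BASE = "https://qodo-platform.qodo.ai/rules/v1"


def resolve_token(env, home):
    """Return the bearer token, or None when none is configured.

    Hypothetical helper: $QODO_API_KEY wins, then ~/.qodo/auth.key.
    """
    token = str(env.get("QODO_API_KEY", "")).strip()
    if token:
        return token
    try:
        text = (Path(home) / ".qodo" / "auth.key").read_text(encoding="utf-8")
    except (OSError, ValueError):
        # Missing or unreadable key file: report "no token" instead of raising.
        return None
    return text.strip() or None


def resolve_base_url(env, home):
    """Return the rules API base URL following the documented precedence."""
    url = str(env.get("QODO_API_URL", "")).strip()
    if not url:
        # Assumption: config.json is consulted only when the env var is unset.
        try:
            raw = (Path(home) / ".qodo" / "config.json").read_text(encoding="utf-8")
            cfg = json.loads(raw)
        except (OSError, ValueError):
            cfg = {}
        if isinstance(cfg, dict):
            url = str(cfg.get("QODO_API_URL", "")).strip()
    if url:
        return url.rstrip("/") + "/rules/v1"
    name = str(env.get("QODO_ENVIRONMENT_NAME", "")).strip()
    if name:
        return f"https://qodo-platform.{name}.qodo.ai/rules/v1"
    return DEFAULT_BASE
```

A caller would typically pass `os.environ` and `Path.home()`. Returning `None` for a missing token, rather than raising, lets the caller say so and proceed without rules.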
+ +## Layout + +``` +qodo-manage-rules/ +├── README.md # you are here — provenance + setup +├── SKILL.md # the workflow (consume + administer) +├── references/ +│ ├── loading-rules.md # structured queries, two-query strategy, severity +│ ├── managing-derived-rules.md # modify-vs-deactivate decision tree + PR-triage playbook +│ └── api-contract.md # endpoints, auth, request/response shapes +└── scripts/ + └── qodo_rules.py # stdlib-only client (list/get/search/find/load/set-state/update) +``` diff --git a/.agents/skills/qodo-manage-rules/SKILL.md b/.agents/skills/qodo-manage-rules/SKILL.md new file mode 100644 index 0000000..da20273 --- /dev/null +++ b/.agents/skills/qodo-manage-rules/SKILL.md @@ -0,0 +1,119 @@ +--- +name: qodo-manage-rules +description: >- + The single skill for an org's Qodo coding rules — both CONSUMING them and ADMINISTERING + them. (1) LOAD the rules relevant to a coding task and apply them while writing code: + use whenever you're about to write, edit, refactor, or review code, or start planning an + implementation, and want to comply with the org's standards up front ("what rules apply + here", "load our coding rules", "check our conventions before I build this"). Skip if + rules are already loaded this session. (2) MANAGE the rule catalog — list, search, + inspect, modify (severity, content, scope), and deactivate/reactivate rules via the Qodo + rules API: use whenever a Qodo automated review flags something that is actually correct + or a deliberate decision and you want to fix the RULE not the code ("this Qodo rule is + wrong / over-broad / stale", "Qodo keeps flagging X", "the rule contradicts our + convention", "loosen / narrow / scope / carve-out that rule", "disable / deactivate / + turn off this rule", "change the rule from error to warning", "which rule produced this + review comment"), or to triage a declined PR review rule-by-rule. 
Trigger even when + "Qodo" isn't named, if the user is loading coding standards or reacting to an automated + review by changing the governing rule. This skill REPLACES the deprecated qodo-get-rules + skill (which had a broken auth lookup). NOT for posting PR comments. +--- + +# Qodo Rules + +Qodo coding rules are **org-wide**: every teammate's PR is graded against them. This skill +is the one place to work with them, in two modes: + +- **Consume** (`load`) — pull the rules relevant to what you're about to build and apply + them while coding, so you comply up front. *(This is the job the old `qodo-get-rules` + skill did; it's folded in here and fixed — see Auth.)* +- **Administer** (`list` / `find` / `get` / `update` / `set-state`) — change the rules + themselves when one is wrong, over-broad, or stale. Higher stakes: an edit changes what + everyone's review flags, so writes are dry-run by default. + +## The one tool + +All API plumbing is in `scripts/qodo_rules.py` (stdlib only, no deps). Use it rather than +hand-rolling curl — the write path is a **full-document PUT**, so the script does the +read-modify-write for you (a hand-built partial body would blank out other fields). 
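To make the read-modify-write point concrete, here is a minimal sketch of how a full-document PUT body could be assembled from a fetched rule. The helper names `build_put_body` and `diff_fields` are hypothetical and need not match `scripts/qodo_rules.py`; the mutable-field list is the one given in `references/api-contract.md`.

```python
# Fields the server accepts in a PUT body, per references/api-contract.md.
MUTABLE_FIELDS = (
    "name", "category", "severity", "content",
    "goodExamples", "badExamples", "sourceUri", "scopes", "state",
)


def build_put_body(current, **changes):
    """Copy every mutable field from the fetched rule, then apply the changes.

    Server-managed keys (ruleId, createdAt, updatedAt, ...) are left out, and
    untouched mutable fields are carried over so the PUT does not blank them.
    """
    unknown = set(changes) - set(MUTABLE_FIELDS)
    if unknown:
        raise ValueError(f"not a mutable field: {sorted(unknown)}")
    body = {key: current[key] for key in MUTABLE_FIELDS if key in current}
    body.update(changes)
    return body


def diff_fields(current, body):
    """Return {field: (before, after)} for a dry-run preview."""
    return {
        key: (current.get(key), value)
        for key, value in body.items()
        if current.get(key) != value
    }
```

In this sketch the dry-run would print `diff_fields(...)` and stop, and only an explicit `--apply` would send `body` to `PUT /rule/{ruleId}`.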
+ +```sh +S=.claude/skills/qodo-manage-rules/scripts/qodo_rules.py + +# CONSUME — load rules for the current task (apply while coding) +python3 $S load --scope /skillrig/cli/ \ + --query $'Name: \nCategory: \nContent: ' \ + --query $'Name: \nCategory: \nContent: ' + +# ADMINISTER — read +python3 $S list --all +python3 $S get 782313 +python3 $S search "verification offline" --scope /skillrig/cli/ +python3 $S find "go:build integration" # resolve ruleId(s) from review text → enriched + +# ADMINISTER — write (default DRY-RUN; add --apply to send the PUT) +python3 $S set-state 782685 inactive # deactivate (reversible — preferred over delete) +python3 $S set-state 782685 inactive --apply +python3 $S update 782313 --severity warning +python3 $S update 782313 --append-content "- Exception: the fetch layer is the feature." +python3 $S update 782313 --content-file /tmp/new_content.md --apply +``` + +Add `--json` for complete machine-readable output; `--limit N` bounds human rows. + +## Auth (differs from the old qodo-get-rules) + +The token is a raw `sk-...` bearer from **`~/.qodo/auth.key`** (or `$QODO_API_KEY`). It is +**not** in `~/.qodo/config.json` — that file is only UI prefs. The old `qodo-get-rules` +skill looked for `config.json:API_KEY`, which doesn't exist, so it was effectively broken in +this environment; that's the main reason this skill replaces it. The same bearer token does +both reads and writes (verified). The script never prints the token. + +## Mode 1 — load rules for a coding task + +Run `load` with two structured queries (a topic query + a cross-cutting query) right before +you write or plan code. It prints the relevant active rules grouped by severity; apply +ERROR (must), WARNING (should), RECOMMENDATION (consider). **Skip if rules are already +loaded this session** ("📋 Qodo Rules Loaded" in recent context). Empty result is valid — +proceed without constraints; never crash on no token / no network. 
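The two pieces of Mode 1 that are easy to get wrong, the three-line query shape and the "active rules grouped by severity" output, can be sketched as follows. This is an illustrative sketch only: `build_query` and `group_active_by_severity` are hypothetical names, and the rule dictionaries are assumed to carry the `severity` and `state` keys of the full rule schema.

```python
# Severity vocabulary as observed in the API contract, most binding first.
SEVERITY_ORDER = ("error", "warning", "recommendation")


def build_query(name, category, content):
    """Format one structured query in the Name/Category/Content shape."""
    return f"Name: {name}\nCategory: {category}\nContent: {content}"


def group_active_by_severity(rules):
    """Keep only active rules and bucket them by severity.

    An empty input yields empty buckets, which is a valid result.
    """
    groups = {severity: [] for severity in SEVERITY_ORDER}
    for rule in rules:
        if rule.get("state") != "active":
            continue
        severity = str(rule.get("severity", "")).lower()
        if severity in groups:
            groups[severity].append(rule)
    return groups
```

Each string returned by `build_query` would be passed as one `--query` argument, and the buckets map onto the ERROR (must), WARNING (should), RECOMMENDATION (consider) tiers.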
+ +Full query-writing guidance (the Name/Category/Content format, category selection, the +two-query strategy, scope, and the severity-application table) is in +**`references/loading-rules.md`** — read it before composing queries. + +## Mode 2 — manage a rule you disagree with + +When a Qodo finding is actually a correct/deliberate decision, fix the rule, not the code: + +1. **Find it.** `python3 $S find ""` → ruleId, severity, + state, and the `source` file it was derived from. +2. **Read it.** `python3 $S get ` — confirm the content matches what was enforced. +3. **Decide** (see `references/managing-derived-rules.md` for the decision tree): + - **Modify** when the intent is right but too broad/strict → narrow `content` (carve-out) + or downgrade `severity`. + - **Deactivate** (`set-state inactive`) when the rule shouldn't apply here at all (e.g. + derived from a generic vendored skill and conflicting with the project's own + convention). Reversible — preferred over deletion; this skill exposes no hard delete. +4. **Preview, then apply.** Writes default to a dry-run printing before/after. Confirm, then + re-run with `--apply`. +5. **Fix the source.** Qodo re-derives rules from files (`CLAUDE.md`, vendored `SKILL.md`s, + design docs). Update the `source` file too, or the rule comes back wrong next scan. + +## Safety + +- **Default to dry-run.** Pass `--apply` only after the diff is shown and clearly correct — + these edits are org-wide. +- **Prefer deactivate over delete.** No undo for a hard delete, so this skill doesn't offer + one; deactivation is a clean round-trip. +- **Confirm gate-weakening writes with the user** — severity downgrades on `error` rules and + deactivations weaken the org's gates; call that out. +- **Never paste the token**, commit it, or write it to a tracked file. + +## References + +- `references/loading-rules.md` — Mode 1: structured query format, two-query strategy, + scope, and severity application. 
+- `references/managing-derived-rules.md` — Mode 2: modify-vs-deactivate decision tree, the + "rules are derived from source files" model, and the worked PR-triage playbook. +- `references/api-contract.md` — endpoints, auth resolution, request/response shapes, and + the list-vs-search schema gotcha (`ruleId` vs `id`). diff --git a/.agents/skills/qodo-manage-rules/evals/trigger-eval-set.json b/.agents/skills/qodo-manage-rules/evals/trigger-eval-set.json new file mode 100644 index 0000000..faed121 --- /dev/null +++ b/.agents/skills/qodo-manage-rules/evals/trigger-eval-set.json @@ -0,0 +1,23 @@ +[ + {"query": "I'm about to add a `bump` subcommand to our skillrig Go CLI under internal/cli. Before I start, pull up the org's coding rules that apply to this repo so I write it compliant from the get-go.", "should_trigger": true}, + {"query": "before i refactor config.ResolveOrigin can you load whatever qodo coding standards we have for /skillrig/cli/", "should_trigger": true}, + {"query": "starting work on the TestQuickstart_ integration tests in test/ — what conventions does our team enforce that I should follow up front?", "should_trigger": true}, + {"query": "qodo's review keeps flagging FetchSkill for doing an HTTPS clone, but network fetch is literally what the `add` command does. can we fix the rule instead of the code?", "should_trigger": true}, + {"query": "that //go:build integration tag rule is wrong for us — we separate integration tests by the test/ directory, no build tag. turn the rule off.", "should_trigger": true}, + {"query": "the rule that blocks introducing new go modules is too strict now that we deliberately approved yaml.v3 — downgrade it from error to warning", "should_trigger": true}, + {"query": "which qodo rule generated this PR comment about 'verification must be offline and deterministic'? 
i want to narrow its scope so it stops hitting the fetch layer", "should_trigger": true}, + {"query": "our automated reviewer declined PR #8 for three reasons that are all deliberate, documented decisions. help me go through each and decide which governing rules to modify or deactivate.", "should_trigger": true}, + {"query": "deactivate qodo rule 782685, it conflicts with the test-layout convention in our CLAUDE.md", "should_trigger": true}, + {"query": "this org coding rule is stale — it predates the network boundary we added in 003. edit its content to add a carve-out exception for the clone/sparse-fetch path.", "should_trigger": true}, + + {"query": "commit this branch, push it, open a PR with the gh cli, and wait for the qodo bot's review to land", "should_trigger": false}, + {"query": "take the qodo review findings from PR #8 and post them as inline comments on the github pull request", "should_trigger": false}, + {"query": "review my current git diff for correctness bugs and flag anything risky before I push", "should_trigger": false}, + {"query": "pull in the terraform-review skill from our skills library and pin it to v1.4.0", "should_trigger": false}, + {"query": "run the check that the vendored skills in .agents/skills haven't been tampered with", "should_trigger": false}, + {"query": "write a Go function that parses the YAML frontmatter out of a SKILL.md file and returns the metadata.x-skillrig map", "should_trigger": false}, + {"query": "explain the GDPR data-retention obligations for our EU customer records", "should_trigger": false}, + {"query": "golangci-lint is erroring on an unused variable in internal/cli/root.go, what's the proper way to silence it", "should_trigger": false}, + {"query": "rename the field repoPath to originPath everywhere in the internal/config package", "should_trigger": false}, + {"query": "how do I point the qodo CLI at a different model in ~/.qodo/config.json?", "should_trigger": false} +] diff --git 
a/.agents/skills/qodo-manage-rules/references/api-contract.md b/.agents/skills/qodo-manage-rules/references/api-contract.md new file mode 100644 index 0000000..141c9cb --- /dev/null +++ b/.agents/skills/qodo-manage-rules/references/api-contract.md @@ -0,0 +1,86 @@ +# Qodo Rules API — contract + +Reverse-engineered live (the management calls are not in any published doc) and verified +with a real `auth.key` bearer token on 2026-05-31. + +## Base URL resolution + +Priority, highest first: +1. `$QODO_API_URL` (or `QODO_API_URL` in `~/.qodo/config.json`) → `{value}/rules/v1` +2. `$QODO_ENVIRONMENT_NAME` → `https://qodo-platform.{env}.qodo.ai/rules/v1` +3. default → `https://qodo-platform.qodo.ai/rules/v1` + +## Auth + +- `Authorization: Bearer ` where the token is a raw `sk-...` string from + **`~/.qodo/auth.key`** (or `$QODO_API_KEY`). **Not** `config.json:API_KEY` — that file + holds only UI prefs (theme, diffDisplay, …). +- The browser portal authenticates the same calls via a session cookie (Chrome's HAR + export strips both the `Authorization` *and* `Cookie` header values, so a captured HAR + shows neither — don't conclude the call is unauthenticated). The bearer token is + accepted for **both reads and writes** — confirmed by a live `PUT` returning `200`. +- Attribution headers used by this skill: `request-id` (a fresh UUID per call) and + `qodo-client-type: skill-qodo-manage-rules` (the portal sends `portal`). + +## Endpoints + +| Op | Method + path | Notes | +|----|---------------|-------| +| List | `GET /rules?page=N` | Paginated; returns `{page, totalCount, rules[]}`. 50 per page. Full rule schema. | +| Get one | `GET /rule/{ruleId}` | Full rule schema. Note **singular** `/rule/` (the list is plural `/rules`). | +| Search | `POST /rules/search` | Semantic. Body `{query, top_k, scopes?}`. **Sparse** result shape — see gotcha. | +| Update | `PUT /rule/{ruleId}` | **Full-document replace.** Used for content/severity/scope edits AND (de)activation via `state`. 
| + +No hard-delete endpoint is used by this skill. A `DELETE /rule/{ruleId}` very likely +exists (same `/rule/{id}` shape) but is intentionally not wired up — deactivation via +`PUT … "state":"inactive"` is the reversible equivalent. + +## ⚠️ Schema gotcha: list/get vs search + +These two return **different shapes** for the same rule: + +- **List / Get** → rich object. Id key is **`ruleId`** (int). Keys include: + `ruleId, name, category, severity, state, content, goodExamples, badExamples, + source, sourceType, sourceUri, sourceUris, scopes, suggestionType, insights, + similaritiesCount, url, createdAt, updatedAt`. +- **Search** → sparse object: only `{id, name, content}`. Id key is **`id`** (int), and + there is **no** severity/state/source. To act on a search hit you must `get` it for the + full record. `qodo_rules.py` normalizes the id key via `_rid()` and `find` auto-enriches + with a follow-up `GET`. + +## PUT body — full-document replace + +A `PUT` must carry the whole mutable document; omitted fields are wiped. Send only the +**server-accepted mutable fields** (everything else is server-managed and rejected/ignored): + +``` +name, category, severity, content, goodExamples, badExamples, sourceUri, scopes, state +``` + +Drop these from the body (server-managed): `ruleId` (it's in the URL), `createdAt`, +`updatedAt`, `source`, `sourceType`, `sourceUris`, `suggestionType`, `insights`, +`similaritiesCount`, `url`. + +Correct pattern (what the script does): `GET /rule/{id}` → keep mutable fields → mutate → +`PUT /rule/{id}`. Response echoes the updated full rule with a fresh `updatedAt`. + +### Field value vocabularies (observed) +- `severity`: `error` | `warning` | `recommendation` +- `state`: `active` | `inactive` +- `category`: e.g. 
`Security, Correctness, Quality, Reliability, Performance, Testability, + Compliance, Accessibility, Observability, Architecture` +- `scopes`: list of repo-path globs like `/skillrig/cli/` (narrows where the rule applies) + +## Worked example — the `PUT` that deactivated rule 782313 + +``` +PUT https://qodo-platform.qodo.ai/rules/v1/rule/782313 +Content-Type: application/json +Authorization: Bearer sk-... + +{ "name": "...", "category": "Architecture", "severity": "warning", + "content": "...", "goodExamples": "...", "badExamples": "...", + "sourceUri": "skillrig/cli/CLAUDE.md", "scopes": ["/skillrig/cli/"], + "state": "inactive" } ← the only changed field +→ 200, body = full updated rule with new updatedAt +``` diff --git a/.agents/skills/qodo-manage-rules/references/loading-rules.md b/.agents/skills/qodo-manage-rules/references/loading-rules.md new file mode 100644 index 0000000..24c6c02 --- /dev/null +++ b/.agents/skills/qodo-manage-rules/references/loading-rules.md @@ -0,0 +1,77 @@ +# Loading rules for a coding task (`load`) + +This is the **consume** path — pull the org's rules relevant to what you're about to build, +so you write code that already complies instead of getting flagged later. (It replaces the +deprecated `qodo-get-rules` skill, whose auth lookup was broken — see SKILL.md.) + +```sh +S=.claude/skills/qodo-manage-rules/scripts/qodo_rules.py +python3 $S load --scope /skillrig/cli/ \ + --query $'Name: \nCategory: \nContent: ' \ + --query $'Name: \nCategory: \nContent: ' +``` + +The script merges the queries (first = priority), dedupes, enriches each hit with its +severity/state (the search endpoint is sparse), keeps only **active** rules, and prints +them grouped by severity. Empty result is valid — proceed without constraints. + +## When to run it + +- Right before you start writing/refactoring code, or when planning an implementation. 
+- **Skip if rules are already loaded** this session (look for "📋 Qodo Rules Loaded" in + recent context) — re-running wastes calls and clutters context. +- Needs a token (`~/.qodo/auth.key` or `$QODO_API_KEY`). No token / no network → say so and + proceed gracefully; don't crash. + +## Why structured queries (not keywords) + +Each rule is embedded as a three-line vector — `Name` / `Category` / `Content`. Mirror that +exact shape so the query aligns on all three dimensions. Keyword lists and flat sentences +retrieve poorly. + +``` +Name: {5–10 word title of the rule this task would trigger} +Category: {one of the categories below} +Content: {1–2 sentences (≥15 words) on what to check; name the tech stack if known} +``` + +**Categories:** `Security, Correctness, Quality, Reliability, Performance, Testability, +Compliance, Accessibility, Observability, Architecture`. + +Picking the category: +- Prefer `Security` if security is plausibly in scope (highest cost if missed). +- Don't default to `Correctness` — it's over-used. Structural work → `Architecture`; + style/naming → `Quality`; availability/error-handling → `Reliability`; logging/metrics → + `Observability`; speed/memory → `Performance`. +- If a topic query returns <3 rules, broaden `Content` with adjacent concepts (e.g. auth → + "token validation, credential handling, session management"). + +## Two-query strategy + +Always pass **two** `--query` blocks: +1. **Topic query** — the task's primary concern (most specific Category + Content). +2. **Cross-cutting query** — recurring standards that apply to most changes (module + structure, error handling, logging, testing conventions). Use `Architecture`, + `Observability`, or `Security` as the Category. + +A single topic query misses the cross-cutting rules; two give the broadest useful coverage. + +## Scope + +`--scope` narrows to rules tagged for a repo path, e.g. `/skillrig/cli/`. 
These are Qodo's +own scope strings (visible in any rule's `scopes` field), **not** necessarily your git +`org/repo`. When unsure, omit `--scope` for an org-wide search — narrowing is an +optimization, not a requirement. + +## Applying what you get + +| Severity | Enforcement | +|---|---| +| **ERROR** | Must comply. If you genuinely can't, flag it to the user rather than silently violating. | +| **WARNING** | Comply by default; if you deviate, say why in your response. | +| **RECOMMENDATION** | Consider; apply when it fits. | + +After coding, briefly note which rules you applied and any WARNING you consciously skipped. +If a rule itself looks wrong for the task (over-broad, stale, conflicts with a documented +convention), that's the cue to switch to the **manage** path — see +`references/managing-derived-rules.md`. diff --git a/.agents/skills/qodo-manage-rules/references/managing-derived-rules.md b/.agents/skills/qodo-manage-rules/references/managing-derived-rules.md new file mode 100644 index 0000000..f098581 --- /dev/null +++ b/.agents/skills/qodo-manage-rules/references/managing-derived-rules.md @@ -0,0 +1,61 @@ +# Managing derived rules — decision tree & PR-triage playbook + +## Rules are derived from source files + +Qodo doesn't invent rules from nothing — it **derives** them by scanning repo files +(`CLAUDE.md`, vendored `SKILL.md`s, design docs) and distilling each into a rule with a +`source` / `sourceUri` pointing back at the file. Two consequences: + +1. **A rule can go stale relative to its source.** If the source file evolves (e.g. the + codebase legitimately grows a network/fetch layer), an old rule derived from an earlier + version can keep enforcing the outdated intent. +2. **Deactivating a rule is not permanent if you leave the source wrong.** The next scan + can re-derive it. So when you fix a rule, also fix the file it came from — otherwise it + comes back, and humans reading the source still see the stale rule. 
+ +## Decision tree: a Qodo finding is wrong — now what? + +``` +Is the rule's intent still valid, just mis-applied here? +├── YES → MODIFY the rule +│ ├── Too broad / catches a legitimate case → narrow `content` (add a carve-out +│ │ exception) or add/adjust `scopes`. +│ └── Right check, wrong weight → change `severity` (e.g. error → warning). +│ └── Then: update the SOURCE file so the narrowed intent is documented there too. +│ +└── NO → the rule shouldn't govern this repo at all + ├── It conflicts with this project's own convention, and a *different* rule + │ already encodes the right convention → DEACTIVATE the wrong rule + │ (`set-state inactive`). Prefer this over delete (reversible). + └── Then: if the wrong rule was derived from a file in THIS repo (e.g. a vendored + generic skill), note that the source will keep re-deriving it; consider + de-scoping the source or documenting the project override in CLAUDE.md. +``` + +Always **preview the dry-run, get user confirmation for org-wide writes, then `--apply`**. +Downgrading an `error` and deactivating both weaken a shared gate — say so explicitly. + +## Worked playbook: the three PR #8 declines (skillrig 003) + +This is the canonical example the skill was built around. A Qodo review declined PR #8 for +three reasons, all of which were actually correct/deliberate decisions: + +| # | Finding | Rule (`find` it) | Right action | +|---|---------|------------------|--------------| +| 1 | Unapproved `yaml.v3` dependency | the "no new Go deps" rule (already **deleted** by the maintainer) | Resolved by deletion. The dep was user-approved + documented (CLAUDE.md, spec A7). Optionally re-create later as an *allowlist* that includes `yaml.v3`, but YAGNI under the pre-release "no backward-compat" marker. 
| +| 2 | `FetchSkill` does network (HTTPS clone) | **782313** *Disallow runtime network access in application code* (warning, `/skillrig/cli/`) — stale: CLAUDE.md now documents a deliberate fetch/network boundary | **MODIFY**: append a carve-out — the fetch layer used by `add`/`search` is the feature; only `verify`/`lint` must stay offline. Leave the narrow sibling **783461** *Verification commands must be fully offline* untouched — it's correct. Update CLAUDE.md if the carve-out isn't already explicit there. | +| 3 | Integration test lacks `//go:build integration` | **782685** *Integration tests must use //go:build integration build tag* (error, derived from the **vendored** `golang-testing/SKILL.md`) | **DEACTIVATE**: it conflicts with the project's deliberate dir-based convention, already encoded by **782315** + **783442** (integration = `test/` dir, `TestQuickstart_*`, no tag). Qodo declined this same rule on the 002 PR — it keeps re-flagging. The generic vendored skill shouldn't outrank the project's own CLAUDE.md. | + +Commands for #2 and #3 (dry-run first, then `--apply`): + +```sh +S=.claude/skills/qodo-manage-rules/scripts/qodo_rules.py +# #2 — narrow the over-broad network rule +python3 $S update 782313 --append-content \ + "- Exception: the fetch layer (clone/sparse-fetch in add/search to pull from a remote origin) is the feature and legitimately performs network I/O. This rule applies only to verify/lint code paths, which must stay offline." 
+# #3 — deactivate the build-tag rule that conflicts with the dir-based convention +python3 $S set-state 782685 inactive +``` + +The general lesson: **don't relitigate a correct decision in the PR thread — fix the rule +(and its source) so the gate reflects reality for everyone.** diff --git a/.agents/skills/qodo-manage-rules/scripts/qodo_rules.py b/.agents/skills/qodo-manage-rules/scripts/qodo_rules.py new file mode 100644 index 0000000..8db96d4 --- /dev/null +++ b/.agents/skills/qodo-manage-rules/scripts/qodo_rules.py @@ -0,0 +1,326 @@ +#!/usr/bin/env python3 +"""Manage Qodo coding rules (list / search / get / modify / deactivate). + +Read operations are unrestricted. WRITE operations (set-state, update) default to a +DRY-RUN that prints the before/after — pass --apply to actually send the PUT. This +matters because Qodo rules are org-wide: a careless edit changes what every teammate's +PR gets graded against. Deactivating (state=inactive) is the reversible alternative to +deleting; this tool deliberately offers no hard DELETE. + +Auth: bearer token from $QODO_API_KEY, else ~/.qodo/auth.key (a raw `sk-...` token). +NOTE: the token is NOT in ~/.qodo/config.json — that file is only UI prefs. + +The Qodo API is a FULL-DOCUMENT PUT: to change one field you must fetch the whole rule, +mutate it, and send the whole thing back. This script does that read-modify-write for +you so callers never hand-build a partial body (which would blank out other fields). +""" +from __future__ import annotations + +import argparse +import json +import os +import sys +import uuid +from urllib.error import HTTPError, URLError +from urllib.request import Request, urlopen + +# Fields the API accepts on a PUT. Everything else (ruleId, createdAt, updatedAt, +# source, sourceType, sourceUris, insights, similaritiesCount, url, suggestionType) +# is server-managed and must be dropped from the body. 
+MUTABLE_FIELDS = ( + "name", "category", "severity", "content", + "goodExamples", "badExamples", "sourceUri", "scopes", "state", +) + + +def resolve_token() -> str: + env = os.environ.get("QODO_API_KEY", "").strip() + if env: + return env + path = os.path.expanduser("~/.qodo/auth.key") + try: + with open(path) as fh: + tok = fh.read().strip() + except OSError: + sys.exit( + "ERROR: no Qodo token. Set $QODO_API_KEY or create ~/.qodo/auth.key " + "(a raw `sk-...` token). The token is NOT read from config.json." + ) + if not tok: + sys.exit(f"ERROR: {path} is empty.") + return tok + + +def resolve_base() -> str: + """{QODO_API_URL}/rules/v1, else ENVIRONMENT_NAME-based, else production.""" + api_url = os.environ.get("QODO_API_URL", "").strip() + cfg = os.path.expanduser("~/.qodo/config.json") + if not api_url and os.path.exists(cfg): + try: + with open(cfg) as fh: + api_url = (json.load(fh).get("QODO_API_URL") or "").strip() + except (OSError, json.JSONDecodeError): + api_url = "" + if api_url: + return f"{api_url.rstrip('/')}/rules/v1" + env = os.environ.get("QODO_ENVIRONMENT_NAME", "").strip() + if env: + return f"https://qodo-platform.{env}.qodo.ai/rules/v1" + return "https://qodo-platform.qodo.ai/rules/v1" + + +def _request(method: str, path: str, body: dict | None = None) -> dict: + base = resolve_base() + url = f"{base}{path}" + headers = { + "Authorization": f"Bearer {resolve_token()}", + "Accept": "application/json", + "request-id": str(uuid.uuid4()), + "qodo-client-type": "skill-qodo-manage-rules", + } + data = None + if body is not None: + data = json.dumps(body).encode() + headers["Content-Type"] = "application/json" + req = Request(url, data=data, headers=headers, method=method) + try: + with urlopen(req, timeout=30) as resp: + raw = resp.read() + except HTTPError as e: + detail = e.read().decode(errors="replace")[:500] + sys.exit(f"ERROR: {method} {path} -> HTTP {e.code}\n{detail}") + except URLError as e: + sys.exit(f"ERROR: network failure reaching 
Qodo ({e.reason}).") + return json.loads(raw) if raw else {} + + +# --- read ops --------------------------------------------------------------- + +def cmd_list(args) -> None: + rules, page, total = [], 1, None + while True: + d = _request("GET", f"/rules?page={page}") + rules += d.get("rules", []) + total = d.get("totalCount", len(rules)) + if not args.all or len(rules) >= total or not d.get("rules"): + break + page += 1 + _emit_rules(rules, total, args) + + +def cmd_get(args) -> None: + r = _request("GET", f"/rule/{args.rule_id}") + print(json.dumps(r, indent=2) if args.json else _fmt_rule(r)) + + +def cmd_search(args) -> None: + # NOTE: /rules/search returns a SPARSE shape — {id, name, content} only, + # no severity/state/source, and the id key is `id` not `ruleId`. _rid() + # normalizes it; use `get ` for the full record. + body = {"query": args.query, "top_k": args.top_k} + if args.scope: + body["scopes"] = [args.scope] + rules = _request("POST", "/rules/search", body).get("rules", []) + _emit_rules(rules, len(rules), args) + + +def cmd_find(args) -> None: + """Resolve a ruleId from free text (e.g. a Qodo PR-review comment). + + Combines semantic search with a literal substring scan over the full catalog, + so a half-remembered rule name still maps to its ruleId. Each candidate id is + then fetched via GET so the output carries severity/state/source uniformly + (search results alone are too sparse to act on). 
+ """ + needle = args.text.lower() + ids: set[int] = set() + for r in _request("POST", "/rules/search", + {"query": args.text, "top_k": args.top_k}).get("rules", []): + ids.add(_rid(r)) + page, total = 1, 0 + while True: + d = _request("GET", f"/rules?page={page}") + for r in d.get("rules", []): + blob = (r.get("name", "") + " " + (r.get("content") or "")).lower() + if needle in blob: + ids.add(_rid(r)) + total = d.get("totalCount", 0) + if page * 50 >= total or not d.get("rules"): + break + page += 1 + full = [_request("GET", f"/rule/{rid}") for rid in ids if rid] + full.sort(key=lambda r: (r.get("state") != "active", r.get("name") or "")) + _emit_rules(full, len(full), args) + + +def cmd_load(args) -> None: + """Load the rules relevant to the CURRENT coding task, to apply while coding. + + This is the 'consume' path (what the deprecated qodo-get-rules skill did, fixed): + run one or two structured queries, merge with the first taking priority, dedupe, + enrich each hit via GET (search is sparse — no severity/state), keep only ACTIVE + rules, and print them grouped by severity. See references/loading-rules.md for how + to write the queries. + """ + order: list[int] = [] + for q in args.query: + body = {"query": q, "top_k": args.top_k} + if args.scope: + body["scopes"] = [args.scope] + for r in _request("POST", "/rules/search", body).get("rules", []): + rid = _rid(r) + if rid and rid not in order: + order.append(rid) + rules = [_request("GET", f"/rule/{rid}") for rid in order] + rules = [r for r in rules if r.get("state") == "active"] + if args.json: + print(json.dumps(rules, indent=2)) + return + print("# 📋 Qodo Rules Loaded\n") + if not rules: + print("No active rules matched this task. Proceeding without rule constraints.\n\n---") + return + print(f"{len(rules)} active rule(s), most relevant first. 
" + f"Apply by severity (error=must, warning=should, recommendation=consider):\n") + rank = {"error": 0, "warning": 1, "recommendation": 2} + for r in sorted(rules, key=lambda x: rank.get(x.get("severity"), 9)): + print(f"- **{r.get('name')}** [{(r.get('severity') or '?').upper()}] " + f"(rule {_rid(r)})\n {(r.get('content') or '').strip()}\n") + print("---") + + +# --- write ops (default dry-run) ------------------------------------------- + +def _put_with_preview(rule_id: int, current: dict, changes: dict, apply: bool) -> None: + body = {k: current.get(k) for k in MUTABLE_FIELDS if k in current} + print(f"\nRule {rule_id}: {current.get('name')}") + for k, new in changes.items(): + old = body.get(k) + if k == "content": + print(f" {k}: (content change — {len(str(old or ''))} -> {len(str(new))} chars)") + else: + print(f" {k}: {old!r} -> {new!r}") + body[k] = new + if not apply: + print("\nDRY-RUN. Re-run with --apply to send the PUT. " + "(Org-wide rule — this changes grading for every teammate.)") + return + res = _request("PUT", f"/rule/{rule_id}", body) + print(f"\nAPPLIED. 
state={res.get('state')} severity={res.get('severity')} " + f"updatedAt={res.get('updatedAt')}") + + +def cmd_set_state(args) -> None: + if args.state not in ("active", "inactive"): + sys.exit("ERROR: state must be 'active' or 'inactive'.") + current = _request("GET", f"/rule/{args.rule_id}") + if current.get("state") == args.state: + print(f"Rule {args.rule_id} already {args.state} — no-op.") + return + _put_with_preview(args.rule_id, current, {"state": args.state}, args.apply) + + +def cmd_update(args) -> None: + current = _request("GET", f"/rule/{args.rule_id}") + changes: dict = {} + if args.severity: + changes["severity"] = args.severity + if args.content_file: + with open(args.content_file) as fh: + changes["content"] = fh.read() + if args.append_content: + changes["content"] = (current.get("content") or "") + "\n" + args.append_content + if args.scope_add: + scopes = list(current.get("scopes") or []) + if args.scope_add not in scopes: + scopes.append(args.scope_add) + changes["scopes"] = scopes + if not changes: + sys.exit("ERROR: nothing to change. Pass --severity / --content-file / " + "--append-content / --scope-add.") + _put_with_preview(args.rule_id, current, changes, args.apply) + + +# --- formatting ------------------------------------------------------------- + +def _rid(r: dict) -> int | None: + """Normalize the id key: list/get use `ruleId`, search uses `id`.""" + return r.get("ruleId") or r.get("id") + + +def _fmt_rule(r: dict) -> str: + sev = r.get("severity") or "?" + state = r.get("state") or "?" 
+ src = r.get("source") or r.get("sourceUri") or "(search result — run `get` for source)" + return f"[{_rid(r)}] ({sev}/{state}) {r.get('name')}\n source: {src}" + + +def _emit_rules(rules, total, args) -> None: + if args.json: + print(json.dumps(rules, indent=2)) + return + if not rules: + print("No matching rules.") + return + for r in rules[: args.limit]: + print(_fmt_rule(r)) + shown = min(len(rules), args.limit) + print(f"\n{shown} shown · {len(rules)} matched · {total} total in catalog") + if shown < len(rules): + print("(use --limit N or --json for more)") + + +def main() -> int: + p = argparse.ArgumentParser(description="Manage Qodo coding rules.") + p.add_argument("--json", action="store_true", help="raw JSON output") + p.add_argument("--limit", type=int, default=30, help="max rows in human output") + sub = p.add_subparsers(dest="cmd", required=True) + + lp = sub.add_parser("list", help="list rules (paginated)") + lp.add_argument("--all", action="store_true", help="fetch every page") + lp.set_defaults(func=cmd_list) + + gp = sub.add_parser("get", help="get one rule by id") + gp.add_argument("rule_id", type=int) + gp.set_defaults(func=cmd_get) + + sp = sub.add_parser("search", help="semantic search") + sp.add_argument("query") + sp.add_argument("--scope", help="e.g. /skillrig/cli/") + sp.add_argument("--top-k", type=int, default=20) + sp.set_defaults(func=cmd_search) + + fp = sub.add_parser("find", help="resolve ruleId from free text (semantic + substring)") + fp.add_argument("text") + fp.add_argument("--top-k", type=int, default=20) + fp.set_defaults(func=cmd_find) + + ld = sub.add_parser("load", help="load rules relevant to the current coding task (apply while coding)") + ld.add_argument("--query", action="append", required=True, + help="structured Name/Category/Content query; pass twice (topic + cross-cutting)") + ld.add_argument("--scope", help="e.g. 
/skillrig/cli/ (omit for org-wide)") + ld.add_argument("--top-k", type=int, default=20) + ld.set_defaults(func=cmd_load) + + stp = sub.add_parser("set-state", help="activate / deactivate a rule (reversible)") + stp.add_argument("rule_id", type=int) + stp.add_argument("state", help="active | inactive") + stp.add_argument("--apply", action="store_true", help="send the PUT (default: dry-run)") + stp.set_defaults(func=cmd_set_state) + + up = sub.add_parser("update", help="modify rule fields (read-modify-write PUT)") + up.add_argument("rule_id", type=int) + up.add_argument("--severity", help="error | warning | recommendation") + up.add_argument("--content-file", help="replace content with file contents") + up.add_argument("--append-content", help="append a line to content (e.g. a carve-out)") + up.add_argument("--scope-add", help="add a scope path") + up.add_argument("--apply", action="store_true", help="send the PUT (default: dry-run)") + up.set_defaults(func=cmd_update) + + args = p.parse_args() + args.func(args) + return 0 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/.agents/skills/skillrig/SKILL.md b/.agents/skills/skillrig/SKILL.md index 4e3f7cf..ddf6bea 100644 --- a/.agents/skills/skillrig/SKILL.md +++ b/.agents/skills/skillrig/SKILL.md @@ -2,16 +2,20 @@ name: skillrig description: >- Point a repository at your org's agent-skills library and manage vendored skills with the - `skillrig` CLI — bind/choose the origin (`init`), vendor/add a skill (`add`), and - verify/check that committed skills are exactly what was approved (`verify`). 
Use whenever - the user wants to find, install, add, vendor, pull in, lock, or pin an agent skill from a - skills library; set/configure where skills come from or fix a "no origin configured" error; - point a repo at a skills repo (OWNER/REPO[@branch]) or use SKILLRIG_ORIGIN; or verify / - check / audit that vendored skills haven't been tampered with (a CI gate) and debug - add/verify errors or `mismatch`/`orphan`/`missing`/`dirty` verdicts. Vendoring requires an + `skillrig` CLI — bind/choose the origin (`init`), search/discover skills in it (`search`), + vendor/add a skill (`add`, local or remote, with an optional immutable `--pin`), + verify/check that committed skills are exactly what was approved (`verify`), and generate + the origin's catalog (`index`). Use whenever the user wants to find, search, discover, + install, add, vendor, pull in, lock, or pin an agent skill from a skills library; filter + skills by topic; set/configure where skills come from or fix a "no origin configured" error; + point a repo at a skills repo (OWNER/REPO[@branch]) or use SKILLRIG_ORIGIN; fetch from a + private/remote origin or debug auth / unreachable / not-found / no-such-version fetch errors; + or verify / check / audit that vendored skills haven't been tampered with (a CI gate) and + debug `mismatch`/`orphan`/`missing`/`dirty` verdicts. Discovery and vendoring require an origin, so `skillrig init` comes first. Trigger even when the command isn't named — e.g. - "point this repo at our skills", "pull in the terraform-review skill", "make sure nobody - changed our skills", "why did the skills check fail in CI", "our agent skill got edited". + "point this repo at our skills", "what skills does our library have for terraform", "pull in + the terraform-review skill", "pin it to v1.4.0", "make sure nobody changed our skills", "why + did the skills check fail in CI", "our agent skill got edited". 
license: MIT metadata: author: skillrig @@ -32,40 +36,50 @@ recomputes it and fails if it drifted — same primitive on both sides, so the g ## When to use this skill -Use it whenever the user wants to **find / install / add / vendor / pull in** an agent skill -from a library, **set or fix the origin** ("point this repo at our skills", "no origin -configured"), or **verify / check / audit** that vendored skills are unmodified (a CI gate), -including debugging `add`/`verify` output. Three activities, three commands: +Use it whenever the user wants to **find / search / discover / install / add / vendor / pull +in** an agent skill from a library, **set or fix the origin** ("point this repo at our +skills", "no origin configured"), **fetch from a private/remote origin** (and debug +auth/unreachable/not-found errors), or **verify / check / audit** that vendored skills are +unmodified (a CI gate), including debugging command output: | Activity | Command | Read | |---|---|---| | Choose where skills come from (bind the origin) | `skillrig init` | [references/init.md](references/init.md) | -| Vendor a skill into the repo (+ lock its identity) | `skillrig add ` | [references/add.md](references/add.md) | +| Discover skills in the origin (search/filter by topic) | `skillrig search [QUERY...]` | [references/search.md](references/search.md) | +| Vendor a skill into the repo (local or remote; `--pin` a version) | `skillrig add ` | [references/add.md](references/add.md) | | Prove vendored skills match what was approved | `skillrig verify` | [references/verify.md](references/verify.md) | +| **Origin-side:** generate the origin's catalog (`index.json`) | `skillrig index` | [references/index.md](references/index.md) | -Load only the reference for the activity at hand. +Load only the reference for the activity at hand. `init`/`search`/`add`/`verify` run in a +**consumer** repo; `index` runs **inside the skills-library (origin) repo** (usually its CI). 
## Prerequisite: an origin must be configured (run `init` first) -`add` needs to know **where** skills come from, so a configured origin is a precondition. -**Smoketest before vendoring:** is an origin resolvable? +`search` and `add` both need to know **where** skills come from, so a configured origin is a +precondition. **Smoketest before discovering/vendoring:** is an origin resolvable? - project: `.skillrig/config.toml` exists (at the git repo root), **or** - env: `$SKILLRIG_ORIGIN` is set, **or** - global: `~/.config/skillrig/config.toml` exists. Precedence (highest wins): `SKILLRIG_ORIGIN` > project config > global. If none resolve, -`add` fails with `no origin configured` — run `skillrig init --origin OWNER/REPO` first -(see [references/init.md](references/init.md)). **`verify` needs no origin** (it reads the -committed lock + tree, offline). +`search`/`add` fail with `no origin configured` — run `skillrig init --origin OWNER/REPO` +first (see [references/init.md](references/init.md)). **`verify` needs no origin** (it reads +the committed lock + tree, offline). + +**Remote vs local origin.** A bare `OWNER/REPO` origin is **fetched over `git`** by +`search`/`add` (a private one auto-resolves a read-only token via `GH_TOKEN` > `GITHUB_TOKEN` +> `gh auth token`, so `gh auth login` once is enough); a filesystem-**path** origin is read +locally with no network. The form is classified automatically — you don't pick a mode. ## The typical workflow ``` skillrig init --origin my-org/my-skills # 1. bind the origin (once per repo) -skillrig add terraform-plan-review # 2. vendor a skill into .agents/skills/ -git add -A && git commit -m "vendor skill" # 3. commit (verify checks committed content) -skillrig verify # 4. prove it matches the recorded version (CI gate) +skillrig search terraform # 2. discover an approved skill (its name/version) +skillrig add terraform-plan-review # 3. 
vendor it into .agents/skills/ (+ optional --pin v1.4.0) +git add -A && git commit -m "vendor skill" # 4. commit (verify checks committed content) +skillrig verify # 5. prove it matches the recorded version (CI gate) ``` Exit codes are load-bearing for CI/agents: `0` ok · `1` usage/config · `2` verification @@ -73,16 +87,15 @@ failure · `3` reserved (never emitted). Errors are what/why/fix on stderr; `--j complete machine view on stdout; `--verbose` shows the raw cause. Details per command in the references. -## Not here yet +## Scope vs. the generic `find-skills` skill -**Discovery / search is the next planned feature.** Listing or searching the *approved* -skills available in your configured origin (a `search`/index command) does **not exist yet** — -until it lands, vendor a skill by its **known name** with `add`. Note the scope: `skillrig`'s -"find" means *find an approved skill in **your origin***, which is distinct from the generic -`find-skills` skill (discovering skills from anywhere). So a request to "find/install a skill +`skillrig search`'s "find" means *find an approved skill in **your origin*** — distinct from +the generic `find-skills` skill (discovering skills from anywhere). So "find/install a skill from our library" is `skillrig`; "what skills exist out there for X?" is `find-skills`. -Also designed but **not implemented** (don't assume they exist): remote/network fetch + auth, -immutable per-skill `--pin`, multi-client symlink views, and a prerequisite/health `doctor` -(the reserved exit `3`). This slice is `init` + `add` (from a local origin checkout) + -`verify`. +## Not here yet + +Designed but **not implemented** (don't assume they exist): multi-client symlink views, a +prerequisite/health `doctor` (the reserved exit `3`), `bump --pr` upgrades, and `global` +scope. The shipped surface is `init` + `search` + `add` (local **or** remote, with `--pin`) + +`verify`, plus the origin-side `index` generator. 
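
The origin precedence from the prerequisite section (`SKILLRIG_ORIGIN` > project config > global) can be sketched as below. This is a minimal illustration only: `resolve_origin` and its parameters are hypothetical names, and the sketch assumes the two config files have already been read into plain strings. It is not the `skillrig` implementation.

```python
from typing import Optional


def resolve_origin(
    env_origin: Optional[str],
    project_origin: Optional[str],
    global_origin: Optional[str],
) -> str:
    """Return the first configured origin, highest precedence first.

    Hypothetical helper mirroring SKILLRIG_ORIGIN > project config > global.
    Empty or whitespace-only values count as "not configured".
    """
    for candidate in (env_origin, project_origin, global_origin):
        if candidate and candidate.strip():
            return candidate.strip()
    # Same wording as the documented error; the real CLI adds what/why/fix text.
    raise LookupError("no origin configured")
```

Under these assumptions, an environment value always wins, and the error is raised only when all three sources are empty.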
diff --git a/.agents/skills/skillrig/references/add.md b/.agents/skills/skillrig/references/add.md index 921d763..503ee14 100644 --- a/.agents/skills/skillrig/references/add.md +++ b/.agents/skills/skillrig/references/add.md @@ -5,20 +5,27 @@ Vendors `` into the canonical `.agents/skills//`, byte-identical and mode-preserving (it injects nothing), and records its identity — `version`, `commit`, -`treeSha`, `path` — in `.skillrig/skills-lock.json`. Offline and consume-only. -**Requires a git repository** (project scope). The recorded `treeSha` is the git tree-SHA -`verify` later recomputes, so the two cannot drift (the gate cannot lie) — never hand-edit -the lock. +`treeSha`, `path` — in `.skillrig/skills-lock.json`. Consume-only (the fetch token is +read-only; there is no write/publish path). **Requires a git repository** (project scope). +The recorded `treeSha` is the git tree-SHA `verify` later recomputes, so the two cannot +drift (the gate cannot lie) — never hand-edit the lock. Path-traversal + symlink guards +apply to remotely-fetched content too. - **Origin, not a path**: `add` resolves the active origin via the shared resolver (`SKILLRIG_ORIGIN` > project `.skillrig/config.toml` > global). There is **no** `--from`/path argument. -- **Local origin (this release)**: the configured `OWNER/REPO` is read from a local git - checkout at `/OWNER/REPO` (resolved against the repo root, so `add` works from - **any subdirectory**) — no network. So `init --origin my-org/my-skills` expects that - library checked out at `/my-org/my-skills` (keep it out of your index, e.g. - `echo 'my-org/' >> .git/info/exclude`). If that checkout is absent, `add` says **"origin - checkout not found"** (distinct from "skill not found"). +- **Two origin forms, classified automatically** (you do not choose a mode): + - **Remote `OWNER/REPO`** (the common case) — `add` **fetches** the skill subtree directly + from `github.com/OWNER/REPO@ref` over `git` (sparse checkout). 
Nothing needs to be checked + out locally. A private origin needs a **read-only** token, resolved automatically via + `GH_TOKEN` > `GITHUB_TOKEN` > `gh auth token` (so `gh auth login` once is enough). + - **Local filesystem path** — if the configured origin is a real path, `add` reads that + local checkout (no network). This is the generalized 002 behavior. +- **`--pin ` — vendor a specific immutable version** (remote path). A bare semver + (`v1.4.0` / `1.4.0`) expands via the origin's `tag_scheme` to the per-skill tag + (`terraform-plan-review-v1.4.0`); anything else is a literal git ref/SHA. The lock records + the resolved `commit` + `treeSha` + the human-readable `version`/tag, so re-adding the same + pin reproduces byte-identical content. Omit `--pin` to vendor the origin branch's tip. - **Idempotent**: re-adding identical content reports success and changes nothing (`action: "unchanged"`). - **Never clobbers**: if the on-disk copy diverges from the recorded fingerprint, `add` @@ -32,6 +39,7 @@ the *committed* tree. | Flag | Purpose | |------|---------| +| `--pin ` | Vendor an immutable version: bare semver → origin `tag_scheme` tag; else literal git ref/SHA | | `--dry-run` | Report what would be vendored/recorded; write nothing | | `--force` | Overwrite a vendored skill whose on-disk content diverges from the lock | | `--json` | Emit the complete `AddResult` on stdout | @@ -44,21 +52,29 @@ the *committed* tree. 1. **Vendor + lock**: `skillrig add terraform-plan-review` → `git add -A && git commit` → `skillrig verify`. -2. **Recover a tampered skill** (a `mismatch`/`dirty` verdict from verify): re-vendor with +2. **Pin a version**: `skillrig add terraform-plan-review --pin v1.4.0` vendors that exact + reviewed version (reproducible via the recorded commit + treeSha + tag). +3. **Recover a tampered skill** (a `mismatch`/`dirty` verdict from verify): re-vendor with `skillrig add --force`, then commit and re-verify. -3. 
**Adopt an `orphan`**: `skillrig add ` records an on-disk-but-unlocked skill. -4. **Preview**: `skillrig add --dry-run`. +4. **Adopt an `orphan`**: `skillrig add ` records an on-disk-but-unlocked skill. +5. **Preview**: `skillrig add --dry-run`. ## Error handling | Symptom (stderr) | Cause | Fix | |------------------|-------|-----| | `no origin configured` | no `SKILLRIG_ORIGIN` / project / global origin | `skillrig init --origin OWNER/REPO`, or set `SKILLRIG_ORIGIN` | -| `origin checkout not found at ` | the configured `OWNER/REPO` is not checked out locally at `/OWNER/REPO` | clone the origin there (`git clone `), or re-bind with `skillrig init` | -| `skill "" not found in origin` | the origin IS present but has no `skills//` | check the name against the origin's `skills/` | +| `authentication ... failed` (**AuthError**) | private origin, no/invalid token | `gh auth login`, or export a `GH_TOKEN`/`GITHUB_TOKEN` with read access. **Not** a typo'd repo — the name resolved fine | +| `... is unreachable` (**UnreachableError**) | network failure / wrong host | check connectivity/proxy and the host in the origin reference; retry | +| `origin "" not found` (**NotFoundError**) | origin repo missing **or** a private repo with no token (GitHub reports both as "not found") | check the spelling; **if private, authenticate** (`gh auth login` / `GITHUB_TOKEN`) — that's the common cause | +| `no version "" of ""` (**NoSuchVersionError**) | a `--pin` that resolves to no tag/ref | list the published versions, or drop `--pin` to take the tip. Distinct from "skill not found" | +| `origin ... 
uses convention version N` (**IncompatibleConvention**) | origin speaks a layout this binary doesn't support | update `skillrig`, or point at a compatible origin | +| `skill "" not found in origin` | the origin IS reachable but has no `skills//` | check the name against the origin's catalog (`skillrig search`) | | `refusing to overwrite ` | on-disk content diverges from the record | re-run with `--force`, or revert local edits | | `not a git repository` | run outside a repo | run inside the repo (or `git init` first) | | bad/missing args | wrong invocation | the error states what/why/fix + an example; or `skillrig add --help` | -All failures state what/why/fix and exit `1`; add `--verbose` for the raw cause. Errors go to -stderr, data to stdout. +These failure classes are **distinct on purpose** — auth vs unreachable vs not-found vs +no-such-version each point at a different fix; don't treat them as one. All failures state +what/why/fix and exit `1`; add `--verbose` for the raw cause. Errors go to stderr, data to +stdout. diff --git a/.agents/skills/skillrig/references/index.md b/.agents/skills/skillrig/references/index.md new file mode 100644 index 0000000..a118115 --- /dev/null +++ b/.agents/skills/skillrig/references/index.md @@ -0,0 +1,63 @@ +# `skillrig index` — generate the origin's catalog (origin-side generator) + +> **This runs INSIDE the skills-library (origin) repo, not a consumer repo.** It is the +> producer of the `index.json` that [search](search.md) consumes. Most users never run it by +> hand — the origin's `index.yml` CI runs it on merge to `main`. + +`index` walks the origin's `skills/*/SKILL.md`, parses each skill's **YAML frontmatter** +(the same `ParseManifest` the consumer commands use — there is exactly one implementation), +and emits the catalog `index.json` at the origin root. 
The catalog is **discovery-only**: +per-skill `name`, `version`, `namespace`, `description`, `topics[]`, `path` (+ a `requires` +summary) and catalog-level `skillrigConvention` + `origin`. No per-skill tree-SHA/commit. + +## When and how it runs + +- **In origin CI**, on `push` to `main` touching `skills/**` (not "on release") — the + workflow regenerates `index.json` and commits it if it changed. +- **Locally**, an origin maintainer can run `skillrig index --out index.json` to regenerate + the catalog after adding/enriching a skill, then commit the result. + +``` +skillrig index --out index.json # regenerate the catalog from skills/*/SKILL.md frontmatter +skillrig index # print to stdout (no --out) +``` + +## Single-tip, full-regenerate + +The catalog reflects **only the branch tip** — one version per skill (the HEAD/latest +version). It is fully regenerated each run and accumulates nothing (no append, no GC). It is +**not** a version-history index: prior versions live in git tags, reached by +`skillrig add --pin `. A skill removed at HEAD correctly drops from `search`. + +## Required frontmatter (enrichment is a checked precondition) + +Each skill's `SKILL.md` must carry the skillrig block under `metadata.x-skillrig.*` — +`version`, `convention-version`, `topics`, `namespace`, and `requires` if it has backing +CLIs. `index` **fails clearly (exit 1)** on a skill missing the required `x-skillrig.version` +rather than silently emitting an under-populated catalog. (Standard agentskills.io +vendored skills carry only `name`/`description`, so this block must be added per skill.) 
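
The precondition above can be sketched as a small check over already-parsed frontmatter. This is an illustration under stated assumptions, not the generator's code: `missing_version_errors` is a hypothetical name, and the nested `metadata` → `x-skillrig` → `version` mapping is one plausible reading of `metadata.x-skillrig.*`.

```python
from typing import Any, Dict, List


def missing_version_errors(skills: Dict[str, Dict[str, Any]]) -> List[str]:
    """Collect one error per skill whose frontmatter lacks x-skillrig.version.

    `skills` maps a skill name to its parsed frontmatter. The nested
    metadata -> x-skillrig -> version layout is an assumption of this sketch.
    """
    errors: List[str] = []
    for name in sorted(skills):  # sorted so the report order is deterministic
        metadata = skills[name].get("metadata")
        block = metadata.get("x-skillrig") if isinstance(metadata, dict) else None
        version = block.get("version") if isinstance(block, dict) else None
        if not version:
            errors.append(f'skill "{name}" missing x-skillrig.version')
    return errors
```

A caller would treat a non-empty result as the exit `1` case rather than writing a partial catalog.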
+ +| Flag | Purpose | +|------|---------| +| `--out ` | Write the catalog to `` (default: stdout) | +| `--json` | The catalog is JSON; this flag keeps parity across commands | +| `--verbose` | Show the raw underlying cause behind an error | + +## Exit codes + +| Code | When | +|------|------| +| `0` | Catalog generated successfully | +| `1` | Usage/config: a skill missing required `x-skillrig.version`, unreadable/invalid frontmatter, bad args, not run in an origin | + +`index` is local-filesystem + `git`-tree only — no network, no auth. It never emits `2`/`3`. + +## Error handling + +| Symptom (stderr) | Cause | Fix | +|------------------|-------|-----| +| `skill "" missing x-skillrig.version` | frontmatter not enriched | add the `metadata.x-skillrig.*` block to that `SKILL.md` | +| `cannot parse frontmatter in ` | malformed YAML frontmatter | fix the `SKILL.md` frontmatter | +| bad args | wrong invocation | the error states what/why/fix; or `skillrig index --help` | + +All failures state what/why/fix and exit `1`; `--verbose` shows the raw cause. diff --git a/.agents/skills/skillrig/references/search.md b/.agents/skills/skillrig/references/search.md new file mode 100644 index 0000000..797d625 --- /dev/null +++ b/.agents/skills/skillrig/references/search.md @@ -0,0 +1,73 @@ +# `skillrig search [QUERY...]` — discover skills in your origin (Query) + +> Find approved skills in your **configured origin** before vendoring one with [add](add.md). +> Needs an origin (run [init](init.md) first). Reads the origin's catalog (`index.json`). + +`search` reads the origin's generated catalog and lists the skills that match. It is the +**discovery** step of the `init → search → add → verify` loop — use it to learn a skill's +exact name and version before `skillrig add `. Scope: it finds an approved skill in +*your origin* (distinct from the generic `find-skills` skill, which discovers skills from +anywhere). 
+ +## How matching works (deterministic — no fuzzy/semantic ranking) + +- **Free-text `[QUERY...]`** — case-insensitive **token-AND substring** over each skill's + `name` + `description` + `topics`. Every query token must match somewhere; multiple tokens + are ANDed. No query = list everything. +- **`--topic ` (repeatable)** — a separate **exact-string** filter applied after the text + match: keep only skills carrying topic ``. (`--topic`, not `--tag`/`--filter`.) +- **Order is deterministic** — a fixed relevance bucket (exact-name > name-match > + topic-match > description-match) then lexicographic by `name`. Same inputs → same output, + always (no TF-IDF, no inference). + +``` +skillrig search # list all skills in the origin +skillrig search terraform # text match on name/description/topics +skillrig search plan --topic aws # text match AND the 'aws' topic +skillrig search --topic platform-team # topic filter only +``` + +## Freshness & origin + +`search` fetches the origin's `index.json` **per call** (no local cache this slice), so +results always reflect the origin's current tip. It resolves the active origin through the +shared resolver (`SKILLRIG_ORIGIN` > project `.skillrig/config.toml` > global) and checks the +origin's convention version before reading. A **remote** origin is fetched over `git` (a +private one uses the auto-resolved read-only token); a **local-path** origin is read with no +network. + +## Output + +- **Human (default)** — one compact line per match (name, version, namespace, truncated + description, `requires` summary) + a footer hint pointing at `add`. An **empty match set is + a clean success (exit 0)** with a "no matches" hint, *not* an error. +- **`--json`** — the complete catalog entries, untruncated, pipeable: + `skillrig search terraform --json | jq '.[].name'`. 
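The matching rules under "How matching works" can be sketched as a small pure function. This is a hedged illustration, not the real `search` implementation: the `Skill` type and the `Search` function are hypothetical names, a multi-token query is assumed to take the best bucket reached by any single token, and repeated `--topic` values are assumed to be ANDed.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Skill is a hypothetical in-memory view of one catalog entry.
type Skill struct {
	Name, Description string
	Topics            []string
}

// matches applies the token-AND rule: every token must be a case-insensitive
// substring of the name, the description, or one of the topics.
func matches(s Skill, tokens []string) bool {
	hay := strings.ToLower(s.Name + "\n" + s.Description + "\n" + strings.Join(s.Topics, "\n"))
	for _, t := range tokens {
		if !strings.Contains(hay, strings.ToLower(t)) {
			return false
		}
	}
	return true
}

// bucket ranks a match: 0 exact-name, 1 name-match, 2 topic-match,
// 3 description-match. Taking the best bucket over all tokens is an assumption.
func bucket(s Skill, tokens []string) int {
	name := strings.ToLower(s.Name)
	best := 3
	for _, t := range tokens {
		t = strings.ToLower(t)
		if t == name {
			return 0
		}
		if strings.Contains(name, t) {
			best = 1
			continue
		}
		for _, topic := range s.Topics {
			if best > 2 && strings.Contains(strings.ToLower(topic), t) {
				best = 2
			}
		}
	}
	return best
}

// hasTopic is the exact-string topic filter.
func hasTopic(s Skill, want string) bool {
	for _, t := range s.Topics {
		if t == want {
			return true
		}
	}
	return false
}

// Search runs the text match, then the topic filter, then the fixed ordering.
func Search(skills []Skill, tokens, topics []string) []Skill {
	out := []Skill{}
	for _, s := range skills {
		keep := matches(s, tokens)
		for _, want := range topics {
			keep = keep && hasTopic(s, want)
		}
		if keep {
			out = append(out, s)
		}
	}
	sort.SliceStable(out, func(i, j int) bool {
		bi, bj := bucket(out[i], tokens), bucket(out[j], tokens)
		if bi != bj {
			return bi < bj
		}
		return out[i].Name < out[j].Name
	})
	return out
}

func main() {
	catalog := []Skill{
		{"terraform-plan-review", "Review a terraform plan for risk and drift.", []string{"platform-team", "terraform", "aws"}},
		{"terraform", "Base terraform helpers.", []string{"terraform"}},
		{"aws-cost", "Estimate cost of a terraform change.", []string{"aws"}},
	}
	for _, s := range Search(catalog, []string{"terraform"}, nil) {
		fmt.Println(s.Name)
	}
}
```

An empty result is simply an empty slice here, which lines up with "zero matches is a clean success" rather than an error path.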
+ +| Flag | Purpose | +|------|---------| +| `--topic ` | Exact-string topic filter (repeatable), applied after the text match | +| `--json` | Emit the complete catalog entries on stdout | +| `--verbose` | Show the raw underlying cause behind a summary or error | + +## Exit codes + +| Code | When | +|------|------| +| `0` | Success — **including zero matches** (empty result is not a failure) | +| `1` | Usage/config: no origin configured, unreachable/auth/incompatible-convention fetching the catalog, bad args | + +`search` never emits `2`/`3` (those are reserved for verification/prerequisite gates). + +## Error handling + +| Symptom (stderr) | Cause | Fix | +|------------------|-------|-----| +| `no origin configured` | no resolvable origin | `skillrig init --origin OWNER/REPO`, or set `SKILLRIG_ORIGIN` | +| `... is unreachable` (**UnreachableError**) | network failure / wrong host | check connectivity/proxy/host; retry | +| `authentication ... failed` (**AuthError**) | private origin, no/invalid token | `gh auth login`, or export `GH_TOKEN`/`GITHUB_TOKEN` | +| `origin "" not found` (**NotFoundError**) | origin missing, or private with no token | check spelling; **if private, authenticate** | +| `origin ... uses convention version N` | origin layout unsupported by this binary | update `skillrig` or use a compatible origin | + +All failures state what/why/fix and exit `1`; `--verbose` shows the raw cause. Errors to +stderr, data to stdout (so `skillrig search --json 2>/dev/null | jq .` stays clean). diff --git a/.agents/skills/skillrig/references/verify.md b/.agents/skills/skillrig/references/verify.md index fac89d8..db01300 100644 --- a/.agents/skills/skillrig/references/verify.md +++ b/.agents/skills/skillrig/references/verify.md @@ -23,9 +23,9 @@ first failure) and takes no arguments. Two checks: ## CRITICAL: verify is integrity-only — a missing backing tool is NOT a failure -`verify` does **no** prerequisite/eligibility check. 
A skill may declare `[[requires]]` backing -tools in its `skill.toml`; if those tools are absent in the environment, `verify` **still -passes** (it checks content, not runnability). Prerequisite checking is a future `doctor` +`verify` does **no** prerequisite/eligibility check. A skill may declare backing tools in its +`SKILL.md` frontmatter (`metadata.x-skillrig.requires`); if those tools are absent in the +environment, `verify` **still passes** (it checks content, not runnability). Prerequisite checking is a future `doctor` concern (the reserved exit `3`), never emitted here. Don't tell a user that verify failed because a tool isn't installed — that's never the cause. diff --git a/.gitignore b/.gitignore index 1ebd1fd..92e77d8 100644 --- a/.gitignore +++ b/.gitignore @@ -13,3 +13,8 @@ __pycache__/ # OS / editor cruft .DS_Store *.swp + +# Secrets / captured traffic — NEVER commit (Qodo API token; HAR dumps may embed bearer tokens) +qodo-auth.key +*.key +*.har diff --git a/.specledger/memory/constitution.md b/.specledger/memory/constitution.md index 0f852fd..ee82bf3 100644 --- a/.specledger/memory/constitution.md +++ b/.specledger/memory/constitution.md @@ -1,5 +1,15 @@ + +| ID | Source(passes) | Category | Severity | Location(s) | Summary | Recommendation | +|----|----------------|----------|----------|-------------|---------|----------------| +| C1 | r1,r2 | Consistency | HIGH | spec.md:FR-0XX ↔ contracts/:LN | | | + +### Coverage summary + + + +| Requirement | Plan | Contract | Quickstart test | Status | +|-------------|------|----------|-----------------|--------| +| FR-001 | | | | Covered / Gap | + +### Decision integrity + + + +- → ✓ / stale at + +### Metrics + +- Requirements: FR + SC · Reviewers: · Critical: · High: · Medium/Low/Info: · Coverage gaps: + +### Next actions + +- Resolve CRITICAL/HIGH before implementation; LOW/MEDIUM may proceed with noted improvements. 
diff --git a/CLAUDE.md b/CLAUDE.md index 4851af1..1910e2c 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -8,6 +8,8 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co `skillrig` is a single, generic, **consume-only** Go CLI for pointing a repo (or a per-user default) at an **origin** — the `OWNER/REPO[@REF]` that hosts an org's agent skills — and managing vendored skills from it. The same binary serves humans, agents, and CI. There is no `publish`/`login` and no write credential in the binary: GitHub is the authority plane ("publishing" = a PR to the origin). +> **DEPRECATED — the sibling `skill.toml` manifest.** As of **003 (spike S1)**, a skill's machine metadata lives in its **`SKILL.md` YAML frontmatter** following the [agentskills.io](https://agentskills.io) standard — standard keys (`name`, `description`, `license`, …) at top level, and skillrig-specific data (`version`, `namespace`, `convention-version`, `topics`, `requires`) under the standard's free-form `metadata` map, namespaced as **`metadata.x-skillrig.*`** (parsed with `gopkg.in/yaml.v3`). The old `skill.toml` sibling file is **removed**; do not reintroduce it. Likewise the historical `[[requires]]` TOML notation in `docs/ARCHITECTURE-v0.md` now means **`metadata.x-skillrig.requires`** (a YAML list) in the frontmatter. `go-toml/v2` is retained ONLY for `.skillrig/config.toml`/`.skillrig-origin.toml`, never for skill manifests. + Two design documents are binding and override general instincts: - `.specledger/memory/constitution.md` — development principles (spec-first, quickstart-as-contract, YAGNI, skill–CLI co-evolution). - `docs/design/cli.md` — the CLI design contract (progressive discovery, errors-as-navigation, two-level output, standard flags, exit codes, command-pattern classification). A CLI behavior change must update this file in the same branch. 
@@ -71,10 +73,9 @@ Features follow SpecLedger: **Specify → Clarify → Plan → Tasks → Review ## Active Technologies -- Go 1.24+ (toolchain 1.24.4) — single static binary. -- Go standard `go test`, two tiers (Constitution II/III): (a) **unit** — table-driven `skillcore` tests + a **ground-truth** test asserting `skillcore.TreeSHA` equals real `git` tree output; (b) **integration** — `TestQuickstart_*` build + exec the real binary over a fixture origin bootstrapped in a tmpDir. **No network boundary this slice → no `httptest`/go-vcr** (that tier arrives with remote `add`). -- `github.com/pelletier/go-toml/v2` (config + `skill.toml` parse); lock uses stdlib `encoding/json`. **No new dependencies -- and no in-process hashing dependency** — the tree-SHA is obtained by *shelling `git`* (see Runtime dependency + research). `go-getter` is explicitly *not* adopted this slice (acquisition is a local origin; OQ-3 deferred). Deps kept minimal (consume-only static binary). -- existing only — `github.com/spf13/cobra` (command tree) -- local files only — vendored subtree under `.agents/skills//` (canonical, committed), `.skillrig/skills-lock.json` (committed, tool-written, atomic). `add` reads the resolved origin (a local path this slice). No database, no network. +- Go 1.24+ (toolchain 1.24.4) — single static binary (unchanged). +- `github.com/spf13/cobra` (commands); `github.com/pelletier/go-toml/v2` (config + retained for `.skillrig/config.toml`); **NEW: `gopkg.in/yaml.v3`** (SKILL.md frontmatter — accepted 2026-05-31 +- `go test`, two tiers — (a) presentation-free **unit** in `internal/...` + `pkg/skillcore` (table-driven + ground-truth: fetched tree-SHA == raw `git`; `index` output == committed `index.json`); (b) **`TestQuickstart_*` integration** in `test/` building/exec'ing the real binary. 
**New network boundary** tested via S4's substrate: `file://` + local bare repo for happy/integrity; the existing `pkg/skillcore/git.go` `commandContext` exec-stub seam (extended to `Clone`/`FetchSparse`) for auth/unreachable/transient. **No `httptest`/go-vcr** (skillrig shells `git`, never calls the GitHub HTTP API — see Constitution Check). +- local files only — vendored subtree `.agents/skills//`, committed lock `.skillrig/skills-lock.json`; origin-side `index.json` (committed in the origin). No DB. **No tool-managed cache** (catalog fetched per `search`). +- the parser `gh` uses; see Complexity Tracking). Lock uses stdlib `encoding/json`. Fetch + tree-SHA via **shelling `git`** (no in-process git/hashing lib). Token via `os.exec` of `git`/`gh` (no `gh`-as-library). diff --git a/docs/ARCHITECTURE-v0.md b/docs/ARCHITECTURE-v0.md index 3d8d9e7..748b905 100644 --- a/docs/ARCHITECTURE-v0.md +++ b/docs/ARCHITECTURE-v0.md @@ -32,8 +32,9 @@ my-org/my-skills (the ORIGIN — created from the skillrig origin template) │ └── / # goreleaser + release-please, same pipeline as skills ├── skills/ │ ├── terraform-plan-review/ -│ │ ├── SKILL.md # agent-facing (vendors to consumer) -│ │ └── skill.toml # machine-facing manifest; [[requires]] may name a cmd/ CLI +│ │ └── SKILL.md # agent-facing + machine-facing manifest in one file: +│ │ # YAML frontmatter carries metadata.x-skillrig.* (version, +│ │ # topics, requires — a `requires` may name a cmd/ CLI). §4.1 │ └── / ├── index.json # generated discovery artifact (committed); carries │ # skillrig-convention version (the contract, §2d) @@ -55,9 +56,10 @@ The generic `skillrig` binary is a single static build (goreleaser, cross-OS/arc | Command | Job | Primary caller | |---|---|---| -| `skillrig search [--tag ...]` | Query `index.json` for skills | human, agent | -| `skillrig add ` | Vendor a skill into this repo + write lock entry | human | +| `skillrig search [QUERY...] 
[--topic ...]` | Query `index.json` for skills (token-AND over name/description/topics + `--topic` filter) | human, agent | +| `skillrig add [--pin ]` | Vendor a skill into this repo + write lock entry (local-path **or** remote fetch) | human | | `skillrig verify` | Offline: integrity only (label-honesty + orphan), exit code | **CI**, agent, human | +| `skillrig index [--out ]` | **Origin-side**: generate `index.json` from `SKILL.md` frontmatter on merge to `main` | **origin CI** | | `skillrig bump --pr` | Detect upstream advance, open upgrade PR | **CI (cron)** | | `skillrig global add/verify ` | Manage global-scope skills | human | | `skillrig doctor` | Superset health check (integrity + prereqs + auth) | human, agent | @@ -83,7 +85,7 @@ A clarifying decision that *removes* a whole subsystem: **the CLI has no `publis | Telemetry | opt-**out** usage snapshot on sync | **none** — telemetry-free by default | Consequences: -- The CLI's surface is **purely consumer-side**: `search`, `add`, `verify`, `doctor`, `bump --pr`, `global *`, `lint`. Agents and humans share one safe surface; there is no write credential baked into the binary. +- The CLI's surface is **purely consumer-side**: `search`, `add`, `verify`, `doctor`, `bump --pr`, `global *`, `lint`, plus the origin-side `index` generator (CI-run, local-filesystem only — no auth). Agents and humans share one safe surface; there is **no write credential** baked into the binary. The token a private origin needs is a **read-only fetch** credential resolved at runtime via `os.exec` (§8b.2), never a registry write credential. - Even `bump --pr` is GitHub-native — it's a *consumer reconciling to upstream* that happens to open a PR; it uses a scoped CI token, not a registry credential. **Convention:** treat `bump --pr` as CI-invoked; a human's agent session generally shouldn't hold PR-create rights (minor, enforce by token scoping, not by code). 
- The only "write" the system does to the monorepo is **`index.json` regeneration**, and that's a merge-triggered GitHub Action running `skillrig index`, not a human/agent verb. - **Differentiator to state explicitly:** no registry service, no telemetry, no bespoke auth flow to build or operate — strictly better on N1 (operational surface) than every registry-backed tool studied, and the no-telemetry default matters for the compliance-conscious consumer. @@ -91,7 +93,7 @@ Consequences: This is the single biggest surface reduction in the design: an entire publish/auth/registry subsystem replaced by "use git, where the org already has a permission model." ### `skillrig lint` — author-side quality gate at PR time (borrowed from Skilldex) -Add a `skillrig lint` command (author/CI-facing) that runs in the **monorepo's** PR CI, scoring each changed skill for **format conformance**: parseable frontmatter, `skill.toml` validity, allowed subdirectories, description length/specificity heuristics. Rationale (Skilldex's finding): *undertriggering* — an agent failing to invoke a skill when it should — is a documented failure mode driven by vague descriptions, and the objective parts of conformance are deterministically scorable (N6-safe) even though semantic description quality is not. Because publishing = a PR, this is exactly where quality enforcement belongs in the GitHub-authority model: the lint is a required check on PRs to `my-org/my-skills`. Keep objective checks blocking; keep any semantic-quality score advisory. +Add a `skillrig lint` command (author/CI-facing) that runs in the **monorepo's** PR CI, scoring each changed skill for **format conformance**: parseable `SKILL.md` frontmatter (incl. the `metadata.x-skillrig.*` extensions), allowed subdirectories, description length/specificity heuristics. 
Rationale (Skilldex's finding): *undertriggering* — an agent failing to invoke a skill when it should — is a documented failure mode driven by vague descriptions, and the objective parts of conformance are deterministically scorable (N6-safe) even though semantic description quality is not. Because publishing = a PR, this is exactly where quality enforcement belongs in the GitHub-authority model: the lint is a required check on PRs to `my-org/my-skills`. Keep objective checks blocking; keep any semantic-quality score advisory. --- @@ -112,7 +114,7 @@ skillrig is a *framework* any org can adopt, not a single-org tool. Three pieces **1. The generic `skillrig` binary (one build for everyone, R4).** Fetched via curl/mise like `gh` or `terraform`. It is **not** compiled per-org and carries **no baked-in origin** — that was the original single-org design (Model A) and it doesn't generalize. Rejected alternative (Model B, "template generates a per-org forked CLI"): every adopting org would run its own goreleaser/release pipeline to ship a binary whose logic is identical to everyone else's, and would be stranded on whatever version they generated. That duplicates operational surface for zero benefit (violates N1) and breaks central updates. So: one generic binary, centrally maintained; the "feels like our CLI" experience comes from it being *pointed at your origin*, not from being a different binary. (A branded shell alias is fine cosmetics; a forked binary is not.) -**2. The origin template (how you stand up a library, R5d).** skillrig.dev provides a **GitHub template repo** — "Use this template → private repo → rename a few strings → dispatch the first release workflow." It ships the origin's structure (`skills/*/SKILL.md` + `skill.toml`, the `index.json` generation workflow, `lint`/CI checks, CODEOWNERS, branch-protection guidance) **and the Go-monorepo backing-CLI pattern** (`cmd/` building the org's private CLIs via goreleaser + release-please, alongside `skills/`). 
Standing up a library needs the git host's native features only — no CLI, no service. *(v0 ships one batteries-included template; vNext can add a minimal skills-only variant and other bootstrap paths.)* +**2. The origin template (how you stand up a library, R5d).** skillrig.dev provides a **GitHub template repo** — "Use this template → private repo → rename a few strings → dispatch the first release workflow." It ships the origin's structure (`skills/*/SKILL.md` with skillrig metadata in its frontmatter — no sibling `skill.toml` as of 003, the `index.json` generation workflow that runs `skillrig index` on merge to `main`, `lint`/CI checks, CODEOWNERS, branch-protection guidance) **and the Go-monorepo backing-CLI pattern** (`cmd/` building the org's private CLIs via goreleaser + release-please, alongside `skills/`). Standing up a library needs the git host's native features only — no CLI, no service. *(v0 ships one batteries-included template; vNext can add a minimal skills-only variant and other bootstrap paths.)* **3. The origin as a versioned contract (R5e) — the key insight.** The origin's conventions *are the contract* the generic binary depends on: skill layout, index generation, tag/version scheme, the tree-SHA boundary. The template is the **reference implementation** of that contract. So the contract carries a **convention version** (e.g. a `skillrig-convention: 1` field in the origin's index/config); the binary checks it and fails clearly against an incompatible origin rather than silently misbehaving — the way Terraform pins provider schema versions. This lets the contract evolve deliberately in vNext without breaking deployed consumers, and makes "validating/evolving the origin conventions" a first-class, versioned activity rather than ad-hoc drift. @@ -158,31 +160,39 @@ Realizes R6–R8. This is the most important architectural idea and the place mo ## 4. 
Lockfile & manifest schemas -### 4.1 `skill.toml` (per-skill manifest — vendors with the skill, R16, R25) - -```toml -name = "terraform-plan-review" -version = "1.4.0" -namespace = "my-org" # reverse-DNS-ish namespacing -description = "Review a terraform plan for risk and drift." - -# Deterministic discovery tags (R24). Suggestion UX is v1; the DATA ships now. -tags = ["platform-team", "terraform", "aws"] - -# Backing-CLI prerequisites (R15). Declared, not installed (R17). -[[requires]] -tool = "oxid" -version = ">=0.4.0" -source = "cdktn-io/oxid" # private repo; mise gh backend -manager = "mise" # supported path; advisory - -[[requires]] -tool = "terraform" -version = ">=1.6" -source = "hashicorp/terraform" -manager = "mise" +### 4.1 `SKILL.md` frontmatter (per-skill manifest — vendors with the skill, R16, R25) + +**Changed in 003 (S1):** the per-skill manifest is the skill's own **`SKILL.md` YAML frontmatter**, not a sibling `skill.toml`. This adopts the [agentskills.io](https://agentskills.io) standard — standard fields (`name`, `description`, `license`, `compatibility`) verbatim, and skillrig-specific data under the standard's free-form `metadata` map, namespaced under `x-skillrig.*`. This is the field-source the catalog (`index`) is generated **from**, so the manifest format *is* the catalog's source of truth. **Notation note:** the `[[requires]]` TOML array-of-tables seen elsewhere in this document is **historical** — the backing-CLI declaration now lives as **`metadata.x-skillrig.requires`** (a YAML list of `{tool, version, source, manager}`) in the frontmatter; read `[[requires]]` as that throughout. `go-toml/v2` is retained only for `.skillrig/config.toml` / `.skillrig-origin.toml`, never for skill manifests. + +```yaml +--- +name: terraform-plan-review +description: Review a terraform plan for risk and drift. 
+license: MIT +metadata: + x-skillrig: + namespace: my-org # reverse-DNS-ish namespacing + version: "1.4.0" + convention-version: "1" # the origin-contract version (gated, §2d) + # Deterministic discovery topics (R24, renamed from "tags"). Suggestion UX is v1; the DATA ships now. + topics: [platform-team, terraform, aws] + # Backing-CLI prerequisites (R15). Declared, not installed (R17). + requires: + - tool: oxid + version: ">=0.4.0" + source: cdktn-io/oxid # private repo; mise gh backend + manager: mise # supported path; advisory + - tool: terraform + version: ">=1.6" + source: hashicorp/terraform + manager: mise +--- ``` +**Why frontmatter, and the `yaml.v3` divergence (S1, accepted 2026-05-31).** The original "sibling TOML file" reasons (two audiences, travels-with-skill, TOML cosmetics) didn't survive scrutiny: frontmatter is more atomic (one file, no name/description drift between two files), and the standard's `metadata` map exists *specifically* for ecosystem extensions, so aligning preserves portability across the 26+ agentskills.io-compliant clients. The cost is a new dependency — **`gopkg.in/yaml.v3`** — because frontmatter *is* YAML; this is the same parser the Go `gh` CLI uses in production for exactly this (flat dotted-prefix keys like `metadata.x-skillrig.version`), so it is a deliberate, validated divergence from the project's earlier "no new dependencies" / "TOML config" stance, not an accident. `.skillrig/config.toml` (origin config) stays TOML; only the per-skill manifest moves to YAML frontmatter. + +> **Correction to an early hypothesis:** the standard's `allowed-tools` field does **not** carry `requires` — it is a space-separated string of agent-permission tool invocations (gh rejects an array form). So `version`, `topics`, `namespace`, `convention-version`, **and** `requires` all live under `metadata.x-skillrig.*`. `x-skillrig.requires` is a nested list (bending the spec's string→string letter), parsed via `interface{}` the way gh does. 
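To make the "nested list parsed via `interface{}`" point concrete, here is a minimal Go sketch that pulls `metadata.x-skillrig.*` out of a generically decoded frontmatter map. It assumes the YAML has already been decoded into `map[string]interface{}` (the shape a generic YAML decode is expected to produce for string-keyed mappings) and therefore uses only the standard library. `ExtractSkillrig`, `SkillrigMeta` and `Requirement` are hypothetical names, not the real `ParseManifest` API.

```go
package main

import (
	"errors"
	"fmt"
)

// Requirement is one entry of metadata.x-skillrig.requires.
type Requirement struct{ Tool, Version, Source, Manager string }

// SkillrigMeta holds the subset of x-skillrig fields this sketch extracts.
type SkillrigMeta struct {
	Version  string
	Topics   []string
	Requires []Requirement
}

// ExtractSkillrig walks the generic map with type assertions, the way a
// nested metadata value has to be handled when its shape is not string->string.
func ExtractSkillrig(front map[string]interface{}) (SkillrigMeta, error) {
	var meta SkillrigMeta
	md, ok := front["metadata"].(map[string]interface{})
	if !ok {
		return meta, errors.New("frontmatter has no metadata map")
	}
	xs, ok := md["x-skillrig"].(map[string]interface{})
	if !ok {
		return meta, errors.New("metadata has no x-skillrig map")
	}
	meta.Version, _ = xs["version"].(string)
	if meta.Version == "" {
		return meta, errors.New("missing x-skillrig.version")
	}
	if raw, ok := xs["topics"].([]interface{}); ok {
		for _, t := range raw {
			if s, ok := t.(string); ok {
				meta.Topics = append(meta.Topics, s)
			}
		}
	}
	raw, _ := xs["requires"].([]interface{}) // absent requires is fine
	for i, item := range raw {
		m, ok := item.(map[string]interface{})
		if !ok {
			return meta, fmt.Errorf("requires[%d] is not a mapping", i)
		}
		var r Requirement
		r.Tool, _ = m["tool"].(string)
		r.Version, _ = m["version"].(string)
		r.Source, _ = m["source"].(string)
		r.Manager, _ = m["manager"].(string)
		if r.Tool == "" {
			return meta, fmt.Errorf("requires[%d] has no tool", i)
		}
		meta.Requires = append(meta.Requires, r)
	}
	return meta, nil
}

func main() {
	front := map[string]interface{}{
		"name": "terraform-plan-review",
		"metadata": map[string]interface{}{
			"x-skillrig": map[string]interface{}{
				"version": "1.4.0",
				"topics":  []interface{}{"platform-team", "terraform", "aws"},
				"requires": []interface{}{
					map[string]interface{}{"tool": "oxid", "version": ">=0.4.0", "source": "cdktn-io/oxid", "manager": "mise"},
				},
			},
		},
	}
	meta, err := ExtractSkillrig(front)
	fmt.Println(meta, err)
}
```

Keeping the extraction in one function is what lets a producer and a consumer share the same rules for a missing version or a malformed `requires` entry.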
+ ### 4.2 `.skillrig/skills-lock.json` (project scope — committed, R9–R11) ```jsonc @@ -277,7 +287,7 @@ A skill's required CLI is one of two kinds, and skillrig handles both: - **Public CLIs — delegated to mise.** For widely-used tools (`terraform`, `gh`, …), provisioning is the developer's choice; **mise** is the supported/recommended path (`mise.toml` + shell hook = good onboarding). skillrig never becomes a binary package manager (R17); it declares and verifies, mise installs. Mechanics: -- **Declare:** `[[requires]]` in `skill.toml` (§4.1), vendored so checks run offline (R16). A `source` of the org's own origin signals a private, co-located CLI; an external source signals a public one. +- **Declare:** `metadata.x-skillrig.requires` in the skill's `SKILL.md` frontmatter (§4.1), vendored so checks run offline (R16). A `source` of the org's own origin signals a private, co-located CLI; an external source signals a public one. - **Check (doctor, not verify):** `skillrig doctor` checks each `requires` entry — on PATH (or resolvable via mise)? version satisfies constraint? — deterministic pass/fail with exit code (R11, R17). A `mise.toml` in the consumer repo is a **suggestion, not a requirement**: skillrig works without it and simply reports the binary missing from PATH; when a `mise.toml` *is* present, "resolvable via mise" counts as satisfied. This is **doctor's** job, not `verify`'s — `verify` validates vendored *content* (label-honesty + orphan) and needs no backing binaries; prerequisite/eligibility is what the *runtime agent* needs (clarified 2026-05-29, §2). - **Auth as a distinct failure (R18):** for private-repo tools (mise gh backend pulling from the origin or e.g. `cdktn-io/oxid`), `doctor` must distinguish "tool missing" from "tool exists but you can't authenticate to fetch it" — explicitly check `gh auth` / `GITHUB_TOKEN` reachability and report it as its own actionable error. The most common onboarding/CI footgun; surface it loudly. 
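The declare-then-check split above can be illustrated with a small Go sketch of the presence half of a prerequisite check. It is an assumption-laden outline, not `doctor` itself: `Resolver`, `onPath` and `MissingTools` are hypothetical names, version-constraint and auth checks are left out, and a mise-backed resolver would be a second function of the same shape.

```go
package main

import (
	"fmt"
	"os/exec"
)

// Resolver reports whether a tool can be found by some mechanism
// (PATH lookup here; a mise-based lookup would share this signature).
type Resolver func(tool string) bool

// onPath uses the standard library's PATH search.
func onPath(tool string) bool {
	_, err := exec.LookPath(tool)
	return err == nil
}

// MissingTools returns, in declaration order, the tools that no resolver
// could find. A tool counts as satisfied as soon as one resolver finds it.
func MissingTools(tools []string, resolvers ...Resolver) []string {
	missing := []string{}
	for _, tool := range tools {
		found := false
		for _, resolve := range resolvers {
			if resolve(tool) {
				found = true
				break
			}
		}
		if !found {
			missing = append(missing, tool)
		}
	}
	return missing
}

func main() {
	// Tool names taken from the requires example in this document.
	for _, tool := range MissingTools([]string{"oxid", "terraform"}, onPath) {
		fmt.Printf("missing prerequisite: %s\n", tool)
	}
}
```

Passing resolvers in as functions keeps the check deterministic under test and leaves room for "resolvable via mise" to count as satisfied without changing the loop.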
@@ -292,7 +302,9 @@ Verified against mise's GitHub backend docs (current as of early 2026). Three fi - Because mise's `github:org/repo` shorthand tracks a repo's *latest release*, a monorepo with interleaved tag streams needs **per-tool tag filtering** (mise's version/tag-regex options) so each backing CLI tracks only its own prefix (`oxid-v*`). This is fiddly enough that **the template should generate the correct `mise.toml` stanza per backing CLI** — a concrete, valuable template job, not something each adopting org should reverse-engineer. - Rejected alternative: **separate repo per CLI** (`github:my-org/oxid`) is cleanest for mise (each tracks its own latest, no tag-regex), but it **breaks the co-location** that lets a skill and its CLI ship together — so it fights the design. Monorepo-with-tag-streams is the chosen path; see open question on exactly how the template stamps `tag_regex`. -**(2) Auth resolution is well-defined and matches R18.** Token resolution order: `github.credential_command` (highest) → `MISE_GITHUB_TOKEN` env → `github_tokens.toml` → gh CLI `hosts.yml` → git credential fill. The **`credential_command`** hook (runs a shell command per host to fetch a token — 1Password/Vault/secret-manager friendly) is the clean **enterprise auth path**, and it's host-aware for GitHub Enterprise. In CI, `jdx/mise-action` uses `${{ github.token }}` by default, so private-repo fetches in the org's own Actions need no extra plumbing. This is the same git/token auth skillrig already relies on (R5b) — no new credential surface. +**(2) Auth resolution is well-defined and matches R18.** mise resolves a GitHub token from (in order) the **env vars first** (`MISE_GITHUB_TOKEN` / `GITHUB_TOKEN`), then `github.credential_command`, then `github_tokens.toml`, then gh CLI `hosts.yml`, then git credential fill. 
The **`credential_command`** hook (runs a shell command per host to fetch a token — 1Password/Vault/secret-manager friendly) is the clean **enterprise auth path**, and it's host-aware for GitHub Enterprise. In CI, `jdx/mise-action` uses `${{ github.token }}` by default, so private-repo fetches in the org's own Actions need no extra plumbing. This is the same git/token auth skillrig already relies on (R5b) — no new credential surface. + +> **Correction (S3, 2026-05-31).** An earlier draft of this section claimed mise's precedence put `github.credential_command` *highest* (above the env vars). That is inaccurate — mise reads the env vars first. This does **not** affect skillrig, which never uses mise's `credential_command` and resolves its own fetch token directly via `os.exec`: `GH_TOKEN` env → `GITHUB_TOKEN` env → `gh auth token --hostname ` (exit 0 + non-empty = token; non-zero or `gh`-absent = skip, not fatal). skillrig injects that token per fetch via `git -c http.extraHeader="Authorization: Basic "` — **never** in the clone URL (avoids history/process-listing leakage). `git credential fill` is deferred (the `gh` path already covers the keyring tokens mise reads from `hosts.yml`); GitHub Enterprise is deferred behind the `hostname` seam of `ResolveGitHubToken(hostname string)`. **(3) mise can verify what it pulls.** mise supports optional **SLSA provenance verification** and **GPG asset verification** for github-backend tools. This complements skillrig's own posture: skillrig verifies the *skill* (tree SHA + approval), mise can verify the *backing binary* (SLSA/GPG) — layered supply-chain integrity, neither owning the other's job. @@ -304,10 +316,23 @@ Verified against mise's GitHub backend docs (current as of early 2026). Three fi ## 9. Discovery — generated `index.json` (R22–R24) -- On release, `internal/index` walks `skills/*/skill.toml` and emits a committed `index.json` at repo root: name, version, description, tags, requires-summary, path. 
-- `skillrig search [--tag platform-team]` reads `index.json` — deterministic filtering on tags (R24, N6). No standing infrastructure (R23, N1). +- **On merge to `main`** (not "on release" — corrected in 003/S2; the origin's `index.yml` workflow is `push: main` paths `skills/**`), **`skillrig index`** walks `skills/*/SKILL.md`, parses each via the **same** `skillcore.ParseManifest` the consumer commands use (AP-04), and emits a committed `index.json` at repo root: per-skill `name`, `version`, `namespace`, `description`, `topics[]`, `path` (+ a `requires` summary); catalog-level `skillrigConvention` + `origin`. The producer's output **must equal** the committed `index.json` — a ground-truth contract test, mirroring the tree-SHA oracle. +- **Single-tip, full-regenerate (D-S2).** `index.json = f(HEAD frontmatter)` — it reflects only the branch tip (one version per skill = the HEAD/latest-released version), is fully regenerated each run, and accumulates nothing (GC is YAGNI). It is **never** a version-history index: a skill's prior versions live in git tags, reached by `add --pin ` fetching the tag subtree directly. A skill removed at HEAD correctly drops from `search` while already-vendored consumers stay fine (their lock is offline-verifiable). Cross-ref tag aggregation was considered and rejected (it would need unbounded tag-walking and break the single root `skillrigConvention`). +- **Why `index` ships in-repo (not a roadmap deferral).** `search` is meaningless against a hand-maintained catalog that drifts — the shipped `build-index.sh` *provably* drifts (it emits only `name/version/description/path`, dropping `topics`/`requires`, which is FR-023). Since skillrig is the single tool for origin maintenance too and the hard part (the frontmatter parser) is already built by the consumer side, `index` is a thin walk+marshal reusing `ParseManifest` — AP-04 by construction. It is consume-only in the credential sense (no auth, local-FS only). 
+- `skillrig search terraform --topic platform-team` reads `index.json` — query-first token-AND over `name`+`description`+`topics`, deterministic order, exact `--topic` filter (R24, N6; `topics` renamed from `tags`). No standing infrastructure (R23, N1). - **GH Pages is dropped from v0.** It added a second system for what `index.json` already provides; for a private repo it would also require Enterprise + private-Pages config purely for browse convenience. If a human browse UI is wanted later (D3), point mkdocs + a client-side search index (pagefind/lunr) at the *same* `index.json` — one source of truth, Pages becomes optional sugar. +### 9a. 003 co-evolution / FR-023–FR-024 doc-sync tracker + +003 touches two repos + the doc set; recorded here so nothing silently drifts: + +- **FR-023 — origin template (`skillrig/origin-template`):** reconcile `scripts/build-index.sh` and the committed `index.json` so the catalog carries every field `search` consumes (notably `topics`); fold each `skills/*/skill.toml` into its `SKILL.md` frontmatter and delete the sibling file; record the schema as the convention-1 catalog contract; point `index.yml` at `skillrig index`. **This in-repo doc-sync does *not* perform that separate `skillrig-origin` migration** — it remains the open FR-023 item. +- **FR-024 — these docs:** the 003+004 merge, the local-vs-remote reframe, `SKILL.md` frontmatter + `yaml.v3`, "catalog on **merge** not on release", and the mise token-precedence correction are recorded above and in `docs/ROADMAP.md`; the CLI behavior (search Query + remote `add` Vendor Mutation + `index` generator, `--pin`/`--topic`, the new error classes) is recorded in `docs/design/cli.md`. +- **Constitution touch-ups (C14 — one pass, team-approved; the constitution is *not* edited by this doc-sync, only the list is recorded for the amendment):** + 1. 
**§III, the ground-truth bullet that reads "generated from a real `skills/*/skill.toml` walk":** `skill.toml` → **`SKILL.md` frontmatter** (the index source moved in S1). + 2. **§III, the testing-tiers "Unit" bullet ("`httptest` + go-vcr cassettes for the GitHub path"):** skillrig shells `git` and never calls the GitHub HTTP API, so the faithful seam is the **exec-stub** (the existing `pkg/skillcore/git.go` `commandContext`/`TestHelperProcess` re-exec, extended to `Clone`/`FetchSparse`), **not** `httptest`/go-vcr. The happy/integrity path uses `file://` + a local bare repo. + 3. **§IX — the stale eval-tooling path:** the eval runner is `.agents/skills/skill-creator/scripts/run_eval.py`, not the `scripts/run_eval.py` the constitution currently cites. + --- ## 9b. External sources, allowlist & audit (R26–R29) — v1+ governance, designed-for in v0 @@ -367,7 +392,7 @@ Each entry: what it is, what it solves for our contract, where it fails, and the **✅ ToolHive (`thv`, Go) — REFRAMED.** *Not* "the closest tool" as first grounded — it's an **MCP-server governance platform** (Registry + Runtime-on-Kubernetes + Gateway + Portal) that recently grew a skills feature. **Structural disqualifier:** daemon-based — the ToolHive API server must be running for every `thv skill` command (a separate `thv serve` process or the desktop app), which conflicts with N1 (no long-running service) and R4 (just a binary), and is especially bad for the CI/agent callers who'd have to stand up the daemon first. **Solves:** scope split (R6/7); multi-client auto-placement (R19–21); the `git://host/owner/repo[@ref][#path]` grammar (confirms R26 is an ecosystem convention); and notably **real content-addressed integrity via OCI digests** (identifier + digest + media type) — the integrity guarantee `gh skill`/Vercel lack. **Fails:** N1/R4 (daemon); no backing-CLI prereq concept (R15–R18); OCI/registry machinery is heavier than vendored-git wants. 
**Nuggets:** (1) OCI digests *validate that content-addressed identity is useful* — we achieve a sufficient git-native equivalent via the git tree SHA (§4.2), avoiding the registry/daemon dependency; (2) its **per-catalog access-control + audit-trail** model is a maturity reference for our v1 governance (R27–R29). -**✅ OpenClaw `openclaw skills` + dependency system.** The **only** ecosystem studied that solves our backing-CLI requirement. **Solves (uniquely):** skills *declare external dependencies* (e.g. `gh`, `curl`); a dependency-install flow (`installSkillDependency`) and — most relevant — `openclaw skills list --eligible` filters to skills *actually runnable in the current environment*, with success defined to include "no missing dependency or auth error." That `--eligible` semantic is precisely our `skillrig verify` prereq check incl. the auth case (R15–R18). **Also solves:** strict realpath-containment on skill discovery (only roots whose resolved realpath stays inside the configured root) — a concrete defense for our orphan/symlink-escape vector (R28); a built-in dangerous-code scanner that blocks installs on critical findings unless overridden; per-agent allowlists that *replace*, not merge with, defaults (R27). **Fails our contract:** TS/Node, not a single binary (R4); registry-backed install flow assumes ClawHub connectivity for parts. **Caveat:** public registry has had security incidents — reinforces our private-first + allowlist posture. **Nugget:** study its dependency-declaration schema + `--eligible` as the reference design for `skill.toml [[requires]]` + `verify` (§4.1, §8). +**✅ OpenClaw `openclaw skills` + dependency system.** The **only** ecosystem studied that solves our backing-CLI requirement. **Solves (uniquely):** skills *declare external dependencies* (e.g. 
`gh`, `curl`); a dependency-install flow (`installSkillDependency`) and — most relevant — `openclaw skills list --eligible` filters to skills *actually runnable in the current environment*, with success defined to include "no missing dependency or auth error." That `--eligible` semantic is precisely our `skillrig verify` prereq check incl. the auth case (R15–R18). **Also solves:** strict realpath-containment on skill discovery (only roots whose resolved realpath stays inside the configured root) — a concrete defense for our orphan/symlink-escape vector (R28); a built-in dangerous-code scanner that blocks installs on critical findings unless overridden; per-agent allowlists that *replace*, not merge with, defaults (R27). **Fails our contract:** TS/Node, not a single binary (R4); registry-backed install flow assumes ClawHub connectivity for parts. **Caveat:** public registry has had security incidents — reinforces our private-first + allowlist posture. **Nugget:** study its dependency-declaration schema + `--eligible` as the reference design for the skill's `requires` declaration (`metadata.x-skillrig.requires` in `SKILL.md` frontmatter — §4.1) + `verify`/`doctor` (§8). **✅ OpenClaw / ClawHub split (the Unix-philosophy question).** Confirmed: native `openclaw skills` handles day-to-day discover/install/update (embedded in the agent), while a **standalone `clawhub` CLI** owns registry-authenticated publish/sync/version/CI workflows — both hitting the same registry data. This is exactly the single-responsibility split worth mirroring: a consumption surface vs. a dedicated publish/CI binary. Validates our "root CLI + per-skill/publish CLIs off `cmd/`" instinct. *(Deeper ClawHub research pending — auth model, sync semantics, CI ergonomics.)* @@ -406,7 +431,7 @@ Recommendation to pressure-test: **Option B for the core, borrow `gh skill`'s cl 8. **Allowlist authoring location** — `allowlist` block inside the `index.json` build inputs vs. 
a standalone `policy.toml`; and whether the allowlist is global-only or also per-consumer-repo (a repo tightening the org default). 9. **Risk-signal provider interface** — what the v1 advisory score source is (Snyk via the public registry? other?), and how `doctor` degrades to silent when offline. 10. **Lockfile write atomicity** (from Skilldex caution) — write to temp file + rename; consider file locking for the CI-bump-vs-human-edit race on the same lock. -11. **Skillset / bundle grouping** (from Skilldex) — should `skill.toml` support grouping a skill with its backing CLI + shared assets (vocab/templates/reference docs) as one coherently-versioned, co-vendored unit? Relevant to skills whose behavior depends on shared context. Likely v1. +11. **Skillset / bundle grouping** (from Skilldex) — should the skill **manifest** (`SKILL.md` frontmatter) support grouping a skill with its backing CLI + shared assets (vocab/templates/reference docs) as one coherently-versioned, co-vendored unit? Relevant to skills whose behavior depends on shared context. Likely v1. 12. **Third scope tier?** (from Skilldex's global/shared/project) — is a "shared" tier (team-wide, multi-repo, not user-global) needed, or do project + global suffice for v0? (Leaning: two tiers for v0.) 13. **`bump --pr` invocation policy** — enforce CI-only by token scoping so human agent sessions don't carry PR-create rights. 14. **Origin convention versioning** (§2d, R5e) — where the `skillrig-convention` version lives (in `index.json`? a top-level `.skillrig-origin.toml`?), the compatibility policy (binary supports conventions N and N-1?), and how the template's release workflow stamps it. The contract surface is small now; pin it before the first external org adopts. @@ -420,14 +445,15 @@ Consolidates the phasing scattered through the doc. 
**v0** is the minimum cohere ### v0 — the minimum coherent framework The smallest thing that delivers the core promise ("the skill your agent runs is the version you approved") end-to-end. -- Generic `skillrig` binary; consumer command surface: `search`, `add`, `verify`, `doctor`, `bump --pr`, `global *`, `lint`, `init` (§2). +- Generic `skillrig` binary; consumer command surface: `search`, `add`, `verify`, `doctor`, `bump --pr`, `global *`, `lint`, `init` + the origin-side `index` generator (§2). **`search` + remote `add` + `index` ship as one merged slice (003), not the original 003/004 split** — they share the remote-fetch layer and the catalog (`search` is useless without a generator that doesn't drift). The same slice migrates the manifest to `SKILL.md` frontmatter (§4.1). +- **Local-vs-remote origin (003).** `add`/`search` classify the origin: a path-shaped origin operates on a local checkout (002 behavior, generalized to a real path), a bare `OWNER/REPO` is fetched remotely (sparse `git` clone). No "both present" precedence; no tool-managed cache of a remote origin. - Two scopes: project (vendored, verify-only) + global (fetch/restore) (§3). **Two tiers only** — no "shared" middle tier. -- Lockfile with `commit` (provenance) + `treeSha` (label honesty) + `requires` (§4.2); `.skillrig/config.toml` (input) + `.skillrig/skills-lock.json` (output) (§2d). -- Origin discovery via env > project config > global default; origin = git, **no auth of its own** (§2d). -- One **batteries-included GitHub template** (skills + Go-monorepo backing-CLI pattern + index/lint/release workflows) (§2d). -- Backing-CLI declare + **doctor**-side eligibility check (`[[requires]]`, `--eligible`-style readiness, auth-as-distinct-failure) (§8) — *not* in `verify`, which is integrity-only; mise consumption via **per-CLI tagged releases + template-generated `mise.toml`** (§8b). 
+- Lockfile with `commit` (provenance) + `treeSha` (label honesty) + resolved `version`/tag + `requires`-summary (§4.2); `.skillrig/config.toml` (input) + `.skillrig/skills-lock.json` (output) (§2d). The per-skill **manifest is `SKILL.md` frontmatter** (`skill.toml` dropped, §4.1). +- Origin discovery via env > project config > global default; the origin contract has **no auth of its own**, but a private origin is fetched with a **read-only token** resolved via `os.exec` (`GH_TOKEN` > `GITHUB_TOKEN` > `gh auth token`) (§2d, §8b.2). +- One **batteries-included GitHub template** (skills + Go-monorepo backing-CLI pattern + index/lint/release workflows; `index.yml` runs `skillrig index` on merge) (§2d). +- Backing-CLI declare + **doctor**-side eligibility check (`requires`, `--eligible`-style readiness, auth-as-distinct-failure) (§8) — *not* in `verify`, which is integrity-only; mise consumption via **per-CLI tagged releases + template-generated `mise.toml`** (§8b). - Multi-client materialization: canonical `.agents/skills` + symlink views, copy-fallback (§6). -- Discovery via committed `index.json` (§9); **deterministic tags ship in the manifest** (data only). +- Discovery via committed `index.json`, **generated by `skillrig index` on merge to `main`** (single-tip, full-regenerate — §9); **deterministic `topics` ship in the manifest** (data only; renamed from `tags`). - Drift-aware **three-way-merge bump** with conflict-markers-and-error (§5b). - `lint` as a required PR check on the origin (§2b). - Orphan protection effectively free at the `verify` gate (on-disk set must equal locked set) (§9b). diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md index 6e2b75e..dc365a7 100644 --- a/docs/ROADMAP.md +++ b/docs/ROADMAP.md @@ -28,10 +28,9 @@ resolver — AP-04 / AP-06) and layers thin commands on top. 
| # | Feature branch | Pattern | Depends on | Status | |---|----------------|---------|------------|--------| -| 001 | **`init` + origin resolution** — `env SKILLRIG_ORIGIN > .skillrig/config.toml > ~/.config/skillrig/config.toml`; `skillrig init [--origin] [--global]` binds an existing origin (never bootstraps) | Environment | — (project skeleton) | 🚧 | -| 002 | **`skillcore` + `add` (local) + `verify`** — git tree-SHA + `skill.toml` parse; **local-origin** `add` (vendor subtree + lock; `--dry-run`/`--force`); offline label-honesty + orphan check; **exit codes 0/1/2** (exit 3 → `doctor`/005) | Vendor Mutation + Verification Gate | 001 | ⬜ | -| 003 | **`search`** — read origin (branch aware) committed `index.json`, deterministic tag filter, Two-Level Output | Query | 001 | ⬜ | -| 004 | **`add` — remote origin fetch** — network fetch from a GitHub-hosted origin (partial-clone + sparse-checkout) + auth; `@ref`/`--pin` immutable pins (local-origin `add` already shipped in 002) | Vendor Mutation | 002 | ⬜ | +| 001 | **`init` + origin resolution** — `env SKILLRIG_ORIGIN > .skillrig/config.toml > ~/.config/skillrig/config.toml`; `skillrig init [--origin] [--global]` binds an existing origin (never bootstraps) | Environment | — (project skeleton) | ✅ | +| 002 | **`skillcore` + `add` (local) + `verify`** — git tree-SHA + manifest parse; **local-origin** `add` (vendor subtree + lock; `--dry-run`/`--force`); offline label-honesty + orphan check; **exit codes 0/1/2** (exit 3 → `doctor`/005) | Vendor Mutation + Verification Gate | 001 | ✅ | +| 003 | **Discover & Acquire — `search` + remote `add` + `index`** (the 003+004 merge — see note below) — `search` reads the origin's committed `index.json` (query-first token-AND over name/description/topics + `--topic` filter, deterministic order, Two-Level Output); **remote `add`** fetches the skill subtree from a GitHub-hosted origin (sparse `git` clone + token via `os.exec`) with `--pin ` immutable pins (local-origin `add` shipped in 
002 — `add` now serves both forms); **`index`** is the origin-side generator that produces the catalog from each skill's `SKILL.md` frontmatter. Commit 1 migrates the manifest from `skill.toml` → `SKILL.md` frontmatter (`gopkg.in/yaml.v3`). | Query + Vendor Mutation + origin-side generator | 002 | 🚧 | | 005 | **backing-CLI prereqs** — `[[requires]]` declare + verify (`--eligible`-style readiness, auth-as-distinct-failure R18); mise consumption via per-CLI tagged releases + template-generated `mise.toml` | (extends verify/doctor) | 002 | ⬜ | | 006 | **`doctor`** — superset health check (integrity + prereqs + auth) | Environment | 002, 005 | ⬜ | | 007 | **`bump --pr`** — detect upstream advance, drift-aware three-way-merge, open reviewable PR (conflict markers + non-zero exit on conflict) | Vendor Mutation | 002, 004 | ⬜ | @@ -41,12 +40,17 @@ resolver — AP-04 / AP-06) and layers thin commands on top. | 011 | **`skills.sh`** — support Vercel's skills.sh hosted skills. External skill adoption workflow (federated skill registries, whitelisted in origin, origin policy provisions for approval/review (skills.sh are evaluated on their usage statistics and audit reports, they should be vetted or flagged with warnings)) | Evolution | 002 | ⬜ | | 012 | **`aws`** — ENTERPRISE - support Private AWS AgentRegistry hosted skills | Evolution | 002 | ⬜ | +> **003 + 004 merged into one slice (decided 2026-05-31, FR-024).** The original roadmap split `search` (003) from remote-`add` fetch (004). They ship together because they share the same new machinery — the remote-fetch layer, the auth/token resolver, and (critically) the **catalog**: `search` is meaningless against a catalog that nothing generates, and the shipped `build-index.sh` provably drifts (it drops `topics`/`requires`). 
So the merged slice also pulls in the origin-side **`index`** generator (reversing the original "consume-only, roadmap a generator" lean — `index` reuses the consumer's `ParseManifest`, so it's a thin walk+marshal, AP-04 by construction) and the **manifest migration** to `SKILL.md` frontmatter. The slice keeps the branch id **003**; there is no separate 004 — the numbers below (005–012) are unchanged. +> +> **Local-vs-remote reframe.** 002's `add` overloaded `OWNER/REPO` as a *local directory* (`/OWNER/REPO`) and ran `git -C` on that working tree. 003 splits the two origin forms cleanly: a **path-shaped** origin operates on a local checkout (002 behavior, generalized to a real filesystem path), while a bare **`OWNER/REPO`** is fetched remotely. There is no "both present" precedence — the tool never creates or caches a local copy of a remote origin. + **Cross-cutting v0 commitments** (architecture §13): - Two scopes only — project (vendored, verify-only) + global (fetch/restore). **No "shared" middle tier.** -- Lockfile carries `commit` (provenance) + `treeSha` (label honesty); the per-skill **manifest** (not the lock) carries `[[requires]]` — the vendored manifest is the single source of truth (002 D4; reconciles §4.2). `.skillrig/config.toml` (input) split from `.skillrig/skills-lock.json` (output) (§2d). -- Origin = git; **no auth of its own** (§2d). +- The per-skill **manifest is `SKILL.md` frontmatter** (agentskills.io standard fields + skillrig extensions under `metadata.x-skillrig.*`; `skill.toml` is dropped as of 003). The frontmatter is the single field-source for both the catalog (`index`) and `add`/`verify` (§4.1). +- Lockfile carries `commit` (provenance) + `treeSha` (label honesty) + the resolved human-readable `version`/tag; the per-skill **manifest** (not the lock) carries `requires` — the vendored manifest is the single source of truth (002 D4; reconciles §4.2). `.skillrig/config.toml` (input) split from `.skillrig/skills-lock.json` (output) (§2d). 
+- Origin = git; **no auth of its own** for the origin contract — a private origin is fetched with a **read-only** token resolved via `os.exec` (`GH_TOKEN` > `GITHUB_TOKEN` > `gh auth token`); still no write credential in the binary (§2d, §8b.2). - One **batteries-included GitHub template** (skills + Go-monorepo backing-CLI pattern + index/lint/release workflows) (§2d). -- Discovery via committed `index.json`; **deterministic tags ship in the manifest** (data only) (§9). +- Discovery via committed `index.json`, **generated by `skillrig index` on merge to `main`** (single-tip, full-regenerate — not "on release"); **deterministic `topics` ship in the manifest** (data only; `topics` renamed from `tags`) (§9). - Orphan protection effectively free at the `verify` gate — on-disk set must equal locked set (§9b). - Supply-chain posture: recommend immutable releases + tag protection on the origin (§9b). @@ -81,7 +85,7 @@ Each is justified only if its trigger fires; recorded here so they aren't silent ## Explicitly *not* built (maps to requirements §5 / architecture §10) -- **Team→skill suggestion engine (D1):** tags ship now (R24); suggestion UX is v1, additive, deterministic-only. +- **Team→skill suggestion engine (D1):** `topics` ship now (R24); suggestion UX is v1, additive, deterministic-only. - **Onboarding wizard (D2):** docs + PR template only. - **Browse UI (D3):** deferred to v1 over the same `index.json`. - **Client tiers (D4):** single-track v0; a deployment concern if it ever matters. 
diff --git a/docs/design/cli.md b/docs/design/cli.md index 0971592..c1e1cd8 100644 --- a/docs/design/cli.md +++ b/docs/design/cli.md @@ -39,9 +39,10 @@ $ skillrig skillrig — rig up your agents with skills (git-native skill distribution) Commands: - search Query the origin's index.json for skills + search Query the origin's index.json for skills [implemented] add Vendor a skill into this repo + write the lock entry [implemented] verify Offline integrity check — label-honesty (exit code; CI gate) [implemented] + index Generate the origin's index.json from skill frontmatter [implemented, origin-side] bump Detect upstream advance, open an upgrade PR global Manage global-scope skills (fetch/restore) doctor Superset health check (integrity + prereqs + auth) @@ -73,14 +74,15 @@ fix: skillrig add (e.g. skillrig add terraform-plan-review); run skillri # `skillrig add --help` then reveals the full (shipped) surface: Usage: - skillrig add [--dry-run] [--force] [--json] [--verbose] + skillrig add [--pin ] [--dry-run] [--force] [--json] [--verbose] Examples: skillrig add terraform-plan-review + skillrig add terraform-plan-review --pin v1.4.0 skillrig add terraform-plan-review --dry-run ``` -> The origin is **resolved**, never passed to `add` (no `--from`/`--origin` arg — clarified 2026-05-30); immutable per-skill `--pin ` is **deferred** (Out of Scope this slice). The synopsis above is the shipped surface. +> The origin is **resolved**, never passed to `add` (no `--from`/`--origin` arg — clarified 2026-05-30). `add` now serves both a **local-path** origin (002) and a **remote** `OWNER/REPO` origin it fetches over `git` (003); the immutable per-skill `--pin ` shipped with the remote path. The synopsis above is the shipped surface. See [Pin vs. branch ref](#origin-reference-grammar) for the `--pin` semantics. 
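
The two-step `--pin` resolution that this document describes under "Pin vs. branch ref" (a bare semver expands through the origin's `name-vSEMVER` tag scheme, anything else is a literal ref) might look roughly like the sketch below. The function name `resolvePin` and the exact semver pattern are assumptions made for illustration; the source only specifies `^v?SEMVER$` and does not say which semver extensions are accepted.

```go
package main

import (
	"fmt"
	"regexp"
)

// bareSemver approximates the documented `^v?SEMVER$` check.
// Assumption: MAJOR.MINOR.PATCH with optional pre-release and build suffixes.
var bareSemver = regexp.MustCompile(`^v?(\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?)$`)

// resolvePin (hypothetical name) returns the git ref to fetch for a --pin value.
func resolvePin(skill, pin string) string {
	if m := bareSemver.FindStringSubmatch(pin); m != nil {
		// Bare version: expand through the per-skill tag scheme name-vSEMVER.
		return skill + "-v" + m[1]
	}
	// Anything else (fully-qualified tag, branch, commit SHA) is used verbatim.
	return pin
}

func main() {
	const skill = "terraform-plan-review"
	for _, pin := range []string{"v1.4.0", "1.4.0", "terraform-plan-review-v1.4.0", "3f2a9c1"} {
		fmt.Printf("--pin %-30s fetches ref %s\n", pin, resolvePin(skill, pin))
	}
}
```

Whether the resolved ref actually exists is a separate step: per the error classes above, a ref that resolves to nothing would surface as `NoSuchVersionError` at fetch time, which this sketch does not model.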
Progressive disclosure: **overview (injected) → usage (explored) → parameters (drilled down).** The agent discovers on-demand, each level providing just enough information for the next step. @@ -108,6 +110,38 @@ add failed: cannot reach origin 'my-org/my-skills' (git: repository not found). → If the repo is private, this is usually auth — see 'gh auth status' / GITHUB_TOKEN. ``` +**The remote-fetch failure classes are distinct typed errors (R17/R18, FR-016–019).** When `add`/`search` fetch a remote origin, three confusable failures must each map to their own error class so the agent debugs the right thing — never collapse them. They are classified inside `pkg/skillcore` from the raw `git` stderr (`AuthError` / `UnreachableError` / `NotFoundError`), then rendered with what/why/fix by `internal/cli`: + +``` +# AuthError — private origin, no/invalid credentials (git stderr: "Authentication failed" / "Invalid username or token") +add failed: authentication to origin 'my-org/my-skills' failed (git: Authentication failed). +→ Authenticate: 'gh auth login', or export a GITHUB_TOKEN / GH_TOKEN with read access to the origin. +→ This is AUTH, not a missing/typo'd repo — the origin name resolved fine. + +# UnreachableError — network failure / wrong host (git stderr: "Could not resolve host" / "Failed to connect") +search failed: origin 'my-org/my-skills' is unreachable (git: Could not resolve host github.com). +→ Check connectivity / proxy / the host in the origin reference; retry. +→ This is a network problem, not auth or a missing repo. + +# NotFoundError — origin (or skill) does not exist. PRIVATE-REPO SUBTLETY: GitHub returns +# "not found" (not 403) for a private repo with no/bad token, so the fix names the auth path too. +add failed: origin 'my-org/my-skills' not found (git: repository not found). +→ Check the origin spelling: 'skillrig init --origin ' or 'cat .skillrig/config.toml'. 
+→ If this is a PRIVATE origin, authenticate via 'gh auth login' or set GITHUB_TOKEN — GitHub reports a private repo you can't see as "not found". +``` + +Keep the **convention-version** failure distinct from all three above — it means the origin is reachable and authenticated but speaks a layout this binary doesn't support: + +``` +# IncompatibleConventionError — origin's convention_version is not the supported value (exact-match == 1) +add failed: origin 'my-org/my-skills' uses convention version 2, but this skillrig supports 1. +→ Update skillrig to a build that supports this origin's convention, or point at a compatible origin. + +# NoSuchVersionError — a '--pin' that resolves to no git ref/tag (distinct from "skill not found") +add failed: no version 'v9.9.9' of 'terraform-plan-review' in origin 'my-org/my-skills' (no such tag). +→ List the published versions on the origin, or omit '--pin' to vendor the origin-branch tip. +``` + ``` # Bad: missing origin, no next step Error: origin not configured @@ -174,14 +208,16 @@ Errors are also a signal about how agents *want* to use the CLI. The concept of Human output is designed for quick scanning — truncated previews, counts instead of nested data, footer hints for next steps. ``` -$ skillrig search --tag terraform +$ skillrig search terraform --topic aws terraform-plan-review | v1.4.0 | my-org | Review a terraform plan for risk and drift. | requires: oxid, terraform terraform-cost | v0.9.0 | my-org | Estimate the cost delta of a plan. | requires: terraform -2 skill(s) match tag 'terraform' -→ Use 'skillrig add ' to vendor one, or 'skillrig search --json' for the full manifest +2 skill(s) match 'terraform' (topic: aws) +→ Use 'skillrig add ' to vendor one, or 'skillrig search --json' for the full catalog entry ``` +`search` is **query-first**: `search [QUERY...]` is a case-insensitive token-AND substring match over `name` + `description` + `topics`, with `--topic` as a separate exact-string filter (repeatable). 
The result order is deterministic (N6) — a fixed relevance bucket (exact-name > name > topic > description match) then lexicographic by name. There is **no** fuzzy / semantic / TF-IDF ranking. An empty result is a **clean exit 0** (not an error) with a footer hint. + **Truncation rules** (human output only): - Description: first 80 chars, append `...` if truncated - Newlines replaced with spaces @@ -191,12 +227,12 @@ terraform-cost | v0.9.0 | my-org | Estimate the cost delta of a plan. ### JSON Output (`--json`): Complete and Pipeable -JSON output includes full, untruncated data — the entire `skill.toml` manifest, full `requires` constraints, full lock entries. The consumer (agent or `jq` pipe) decides what to extract. No truncation, no previews. +JSON output includes full, untruncated data — the complete catalog entry (`name`, `version`, `namespace`, `description`, `topics[]`, `path`, and any `requires` summary), full lock entries, full `SKILL.md` frontmatter. The consumer (agent or `jq` pipe) decides what to extract. No truncation, no previews. ```bash -# Agent workflow: scan the index, then selectively drill down into one manifest -skillrig search --tag terraform --json | jq '.[].name' # scan names -skillrig search terraform-plan-review --json | jq '.requires' # drill into prereqs +# Agent workflow: scan the index, then selectively drill down into one entry +skillrig search terraform --topic aws --json | jq '.[].name' # scan names +skillrig search terraform-plan-review --json | jq '.[0].requires' # drill into prereqs ``` **Rule**: JSON output is the "execution layer" — complete, structured, pipeable. Human output is the "presentation layer" — budget-conscious, hinted. Token efficiency is achieved by the *workflow pattern* (search → inspect), not by truncating JSON. 
@@ -233,6 +269,13 @@ A small set of flags carry the same meaning across every command, so an agent ca `--force` and the verify-time label-honesty check are two sides of one rule: divergent content is never written or accepted silently. `--force` is the *human's* deliberate override at write time; `verify` is the *gate's* refusal at check time. +A few command-specific flags carry consistent meaning where they apply: + +| Flag | Applies to | Meaning | +|------|-----------|---------| +| `--pin ` | `add` | Vendor a specific **immutable** version of the skill rather than the origin-branch tip. A bare `^v?SEMVER$` value expands via the origin's `tag_scheme` (e.g. `v1.4.0` → `terraform-plan-review-v1.4.0`); any other value is treated as a literal git ref/SHA. The lock records the resolved `commit` + `treeSha` + the resolved human-readable `version`/tag, so re-acquisition is byte-identical and humans can still reason about versions. A pin that resolves to no ref is a distinct `NoSuchVersionError`, **not** "not found". See [Pin vs. branch ref](#origin-reference-grammar). | +| `--topic ` | `search` | Repeatable exact-string filter applied **after** the free-text `[QUERY...]` match — narrows results to catalog entries carrying topic ``. It is `--topic` (not `--filter`/`--tag`): the catalog field is `topics[]`, renamed from `tags` to avoid colliding with git-tag/version-pin terminology. | + --- ## Origin Reference Grammar @@ -246,7 +289,16 @@ skillrig init --origin my-org/my-skills@staging # track the 'staging' branch This realizes the `@ref` half of the ecosystem-standard identity grammar `OWNER/REPO[/path]@ref` (architecture R26) that `gh skill` (`gh skill install github/awesome-copilot documentation-writer@v1.2.0`) and Vercel `npx skills` use. The `[/path]` portion remains future work. -**Two meanings of `@ref`, kept distinct.** For an **origin**, `@REF` is a *moving pointer* — a branch you track and re-resolve. 
For a **skill** vendored via `add` (`skillrig add --pin `, **planned** — not in the current slice), the ref is an *immutable* pin — a tag or commit SHA, recorded in the lock so the vendored content is reproducible. Same grammar, opposite intent: the origin says "where to look (and which line of development)"; the pin says "exactly which reviewed bytes." Docs and help text must not conflate them. +**Two meanings of `@ref`, kept distinct.** For an **origin**, `@REF` is a *moving pointer* — a branch you track and re-resolve. For a **skill** vendored via `add` (`skillrig add --pin `, **shipped** with the remote-fetch path), the ref is an *immutable* pin — a tag or commit SHA, recorded in the lock so the vendored content is reproducible. Same grammar, opposite intent: the origin says "where to look (and which line of development)"; the pin says "exactly which reviewed bytes." Docs and help text must not conflate them. + +#### Pin vs. branch ref + +`--pin ` resolves in two steps so the common case (a version) is ergonomic and the escape hatch (any git ref) still works: + +- A **bare semver** (`^v?SEMVER$`, e.g. `1.4.0` or `v1.4.0`) is expanded through the origin's `tag_scheme` (`name-vSEMVER`) to the per-skill tag — `--pin v1.4.0` on `terraform-plan-review` resolves to the tag `terraform-plan-review-v1.4.0`. The fully-qualified tag is also accepted. +- **Anything else** is treated as a **literal git ref or commit SHA** and fetched verbatim. + +A pin that resolves to no ref is a distinct `NoSuchVersionError` (exit 1) — deliberately *not* the same as `NotFoundError` (origin/skill missing), so the agent sees "that version doesn't exist" rather than "that skill doesn't exist". The lock records the resolved `commit` (provenance) + `treeSha` (label-honesty) + the resolved human-readable `version`/tag. 
The origin publishes **no per-skill tree-SHA** (the catalog is discovery-only), so label-honesty here means "the on-disk content still matches what was vendored at this commit," anchored by provenance — not "matches an origin-attested hash." ### Why a single `@ref` string, not a separate flag @@ -262,19 +314,21 @@ Every `skillrig` subcommand MUST identify which pattern(s) it follows. This clas | Pattern | Purpose | Examples | Constraints | |---------|---------|----------|-------------| -| **Query** | Deterministic read of the discovery artifact | `search` | Offline. Reads committed `index.json`. Deterministic tag filtering — **no inference** (N6). | -| **Vendor Mutation** | Write skill tree + lock entry | `add` *(implemented)*, `bump --pr` | Writes lock via `skillcore` only. Supports `--dry-run`; refuses to clobber content that diverges from the locked `treeSha` without `--force`. `bump` *proposes* (opens a PR), never force-adopts (R13). MUST never silently discard local edits (R32). Vendors byte-identical + mode-preserving; the skill name MUST be a single path segment (no traversal). **Symlinks in a skill subtree are rejected this slice** — following them would break byte-identical / git-canonical vendoring (git records a symlink as a link, not its target); preserving symlinks faithfully is a future relaxation. | -| **Verification Gate** | Offline integrity / prereq / conformance | `verify` *(implemented — integrity-only)*, `lint` | MUST be offline + deterministic. Exit-code driven. **No live/online signal in this path** (R11/N1). `verify` = consumer CI gate; `lint` = author CI gate on the origin. As implemented, `verify` is **integrity-only** (label-honesty + orphan detection, exit 2); prerequisite/eligibility checks (a missing `[[requires]]` tool → exit 3) belong to the future `doctor`, so `verify` does not emit exit 3 today. 
| +| **Query** | Deterministic read of the discovery artifact | `search` *(implemented)* | Reads the origin's `index.json` (fetched per call — no offline cache this slice; an unreachable origin is the `UnreachableError`). Query-first: deterministic token-AND substring over `name`+`description`+`topics` + exact `--topic` filter; fixed relevance-bucket then lexicographic order — **no inference / no fuzzy ranking** (N6). Empty result = clean exit 0. Gates the origin's `skillrigConvention` before reading. | +| **Vendor Mutation** | Write skill tree + lock entry | `add` *(implemented — local + remote)*, `bump --pr` | Writes lock via `skillcore` only. Serves a **local-path** origin (read a checkout) and a **remote** `OWNER/REPO` origin (fetch the subtree over `git`, token obtained by running `gh`/`git` through `os/exec`, never a write credential) — the two origin forms are classified, never "both-present". `--pin` vendors an immutable version. Supports `--dry-run`; refuses to clobber content that diverges from the locked `treeSha` without `--force`. `bump` *proposes* (opens a PR), never force-adopts (R13). MUST never silently discard local edits (R32). Vendors byte-identical + mode-preserving; the skill name MUST be a single path segment (no traversal); **path-traversal + symlink guards apply to remotely-fetched content too**. **Symlinks in a skill subtree are rejected this slice** — following them would break byte-identical / git-canonical vendoring (git records a symlink as a link, not its target); preserving symlinks faithfully is a future relaxation. | +| **Verification Gate** | Offline integrity / prereq / conformance | `verify` *(implemented — integrity-only)*, `lint` | MUST be offline + deterministic. Exit-code driven. **No live/online signal in this path** (R11/N1). `verify` = consumer CI gate; `lint` = author CI gate on the origin.
As implemented, `verify` is **integrity-only** (label-honesty + orphan detection, exit 2); prerequisite/eligibility checks (a missing `requires` tool → exit 3) belong to the future `doctor`, so `verify` does not emit exit 3 today. | | **Environment** | Health, auth, config, bootstrap | `doctor`, `init` | MUST be idempotent. `doctor` checks prerequisite auth (R18); works without a fully-configured project. `init` is **consumer-side only** — binds to an *existing* origin, never bootstraps one (architecture §2d). | | **Global Management** | Fetch/restore user-scope skills | `global add`, `global verify` | Genuinely *fetches and materializes* (the restore mode project scope doesn't need, §3). Touches per-environment home dirs, never the repo's project lock (R8). | +> **Origin-side generator — `index` (not a consumer pattern).** `skillrig index` is the only command that runs **inside the origin repo** (in its `index.yml` CI on merge to `main`), not against a consumer's vendored tree. It walks `skills/*/SKILL.md`, parses each via the **same** `skillcore.ParseManifest` the consumer commands use (AP-04), and emits/marshals `index.json` — the catalog `search` consumes. It is **not** one of the five consumer patterns above: it produces the discovery artifact rather than reading or vendoring it. It is still consume-only in the credential sense — no auth, local-filesystem only — so it does not breach AP-05. Constraints: deterministic full-regenerate output (no append/aggregation/GC — single-tip catalog); the producer's output MUST equal the committed `index.json` (a ground-truth contract test); and it MUST **fail clearly** (exit 1) on a skill missing its required `x-skillrig.version` rather than silently under-emitting. Exit codes `0`/`1` only. + ### Failure Mode Constraints Each pattern has a distinct failure mode expectation: | Pattern | Failure Mode | |---------|-------------| -| **Query** | MUST fail with clear error + suggested fix (e.g. no origin → run `init`). 
| +| **Query** | MUST fail with clear error + suggested fix (no origin → run `init`; unreachable/auth/incompatible-convention fetching the catalog → the matching typed error). An **empty match set is success (exit 0)**, not a failure — it prints a footer hint, not an error. | | **Vendor Mutation** | MUST validate origin + auth before fetching. Three-way-merge conflict → non-zero exit, write git-style conflict markers, instruct resolve-and-rerun (architecture §5b). Never discard local edits. | | **Verification Gate** | MUST be deterministic pass/fail by exit code. Label-honesty mismatch = fail (exit 2); orphan = fail (exit 2); unresolved conflict markers = fail. Prereq miss (exit 3) is reserved for the future `doctor` — the implemented `verify` is integrity-only and does not emit it. | | **Environment** | MUST be idempotent and safe to retry. MUST distinguish "tool missing" from "tool exists but unauthenticated" (R18). | @@ -284,9 +338,9 @@ Each pattern has a distinct failure mode expectation: The product promise — "the skill your agent runs is exactly the version that was reviewed and approved" — rides on `verify` being **offline and deterministic** (architecture §2c, R11). Honor the split: -- **Offline always** (`search`, `verify`, `lint`): operate on committed `index.json` / `skills-lock.json` and the git tree already on disk. The project tree is in git, so there is no "restore from lock" — `verify` only *checks* (§3). Fully offline. -- **Network when fetching** (`add`, `bump --pr`, `global add`): reach the origin / git to vendor or restore content. MUST fail with a clear error when the origin is unreachable. -- **Auth-aware** (`doctor`): explicitly probes `gh auth` / `GITHUB_TOKEN` reachability for private backing-CLI sources and reports auth as its own actionable failure (R18). +- **Offline always** (`verify`, `lint`, `index`): operate on committed `skills-lock.json` + the git tree on disk (`verify`/`lint`) or the origin's local `skills/` tree (`index`). 
The project tree is in git, so there is no "restore from lock" — `verify` only *checks* (§3). Fully offline; `index` shells out to `git` only to compute tree-SHAs of a local tree, never the network. +- **Network when fetching** (`add`, `search`, `bump --pr`, `global add`): reach the origin / git to vendor content or read the catalog. `search` fetches the origin's `index.json` **per call** (no offline cache this slice — a deliberate freshness choice, D-catalog-fetch); `add` fetches the skill subtree. MUST fail with the matching typed error (`UnreachableError` / `AuthError` / `NotFoundError`) when the origin can't be reached. *(A **local-path** origin makes `add`/`search` operate on a checkout with no network — the origin-form classification, not a cache.)* +- **Auth-aware** (`add`, `search`, `doctor`): resolve a read token via `os/exec` (`GH_TOKEN` env > `GITHUB_TOKEN` env > `gh auth token --hostname `), inject it into the fetch via `git -c http.extraHeader` (never in the URL), and surface auth as its own actionable failure distinct from unreachable/not-found (R18). The token is a **read-only fetch** credential — there is still no write credential in the binary (AP-05). --- @@ -328,7 +382,7 @@ return fmt.Errorf("add failed: you must run 'gh auth login'") return fmt.Errorf("add failed: %s\n→ Check the origin: 'cat .skillrig/config.toml'\n→ If the repo is private, see 'gh auth status' / GITHUB_TOKEN", stderr) ``` -### AP-04: A parallel tree-SHA / manifest-parse implementation +### AP-04: A parallel tree-SHA / manifest-parse / fetch / catalog implementation ```go // Wrong: bump computes the tree SHA one way, verify recomputes it another way. // They drift, and the value CI writes during bump no longer matches what @@ -341,6 +395,7 @@ sha := someOtherHash(dir) // in verify // hard boundary (architecture §2: "the two interfaces cannot diverge").
sha := skillcore.TreeSHA(dir) ``` +The same single-implementation rule extends to every shared primitive 003 adds: **one** remote-fetch impl, **one** `ParseManifest` (the `SKILL.md` frontmatter reader), **one** catalog parse/generate (`search` reads what `index` writes), and **one** search matcher — all in `pkg/skillcore`. `index` generating a catalog `search` can't parse, or `add` fetching a way `verify` can't reproduce, is the same drift this anti-pattern forbids. ### AP-05: Baking the origin into the binary, or adding a write credential ``` @@ -379,11 +434,11 @@ Inside the CLI, there are two conceptual layers: │ Truncation | Footer hints | stderr/stdout │ ├─────────────────────────────────────────────┤ │ Execution: Go business logic │ ← Cobra routing; skillcore (tree SHA, -│ skillcore | index compare | lock I/O │ manifest parse); index/ compare; lock R/W -└─────────────────────────────────────────────┘ +│ skillcore | fetch | catalog | lock I/O │ frontmatter parse, fetch, catalog, search); +└─────────────────────────────────────────────┘ lock R/W ``` -The execution layer handles command routing, the shared `skillcore` primitives (tree-SHA computation, `skill.toml` / lock parsing), index comparison for `bump`, and lock I/O. The presentation layer formats output for the consumer (human or agent). The presentation/execution split itself is a design concern within each command's `runXxx()` function — but the integrity primitives are **not** inline: `skillcore` is a separate, importable **public package** (`pkg/skillcore`, per SDK-1), so third-party Go tools can build their own `add`/`verify` on the same primitives the CLI uses. +The execution layer handles command routing, the shared `skillcore` primitives (tree-SHA computation, `SKILL.md` frontmatter / lock parsing, remote fetch, catalog parse/generate, the search matcher), index comparison for `bump`, and lock I/O. The presentation layer formats output for the consumer (human or agent). 
The presentation/execution split itself is a design concern within each command's `runXxx()` function — but the integrity primitives are **not** inline: `skillcore` is a separate, importable **public package** (`pkg/skillcore`, per SDK-1), so third-party Go tools can build their own `add`/`verify` on the same primitives the CLI uses. **Key rule**: Execution logic must not depend on output format. The same data path serves both `--json` and human output. And per AP-04, there is exactly one `skillcore` implementation of the integrity primitives — the public `pkg/skillcore` package — that `add` and `verify` (and future `bump`/`doctor`) all dispatch to, so the gate can never diverge from what CI wrote. If an MCP surface for agents is ever added, it dispatches to `pkg/skillcore` too — never a parallel implementation (architecture §2). diff --git a/go.mod b/go.mod index c0d55a9..cf548cb 100644 --- a/go.mod +++ b/go.mod @@ -5,6 +5,7 @@ go 1.24 require ( github.com/pelletier/go-toml/v2 v2.3.1 github.com/spf13/cobra v1.10.2 + gopkg.in/yaml.v3 v3.0.1 ) require ( diff --git a/go.sum b/go.sum index 899b70d..898b95c 100644 --- a/go.sum +++ b/go.sum @@ -9,4 +9,7 @@ github.com/spf13/cobra v1.10.2/go.mod h1:7C1pvHqHw5A4vrJfjNwvOdzYu0Gml16OCs2GRiT github.com/spf13/pflag v1.0.9 h1:9exaQaMOCwffKiiiYk6/BndUBv+iRViNW+4lEMi0PvY= github.com/spf13/pflag v1.0.9/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg= go.yaml.in/yaml/v3 v3.0.4/go.mod h1:DhzuOOF2ATzADvBadXxruRBLzYTpT36CKvDb3+aBEFg= +gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM= gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= +gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA= +gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= diff --git a/internal/cli/add.go b/internal/cli/add.go index 5aca3cd..63199bc 100644 --- a/internal/cli/add.go +++ b/internal/cli/add.go @@ 
-3,6 +3,7 @@ package cli import ( "errors" "fmt" + "os" "path/filepath" "github.com/spf13/cobra" @@ -16,6 +17,7 @@ import ( type addCmd struct { opts *globalOpts skill string + pin string dryRun bool force bool @@ -40,20 +42,29 @@ func newAddCmd(opts *globalOpts) *cobra.Command { Short: "Vendor a skill from your configured origin into .agents/skills/", Long: "Vendor a named skill from this repo's configured origin into the canonical\n" + ".agents/skills//, recording its identity (version, commit, tree-SHA, path)\n" + - "in .skillrig/skills-lock.json. add is offline and consume-only: it resolves the\n" + - "active origin (SKILLRIG_ORIGIN > project > global) exactly like every command and\n" + - "copies the skill byte-identically, injecting nothing.\n\n" + - "Local origin (this release): the configured origin OWNER/REPO is read from a local\n" + - "git checkout at /OWNER/REPO — resolved against the repo root, so add\n" + - "works from any subdirectory — not over the network. So `init --origin my-org/my-skills`\n" + - "expects that library checked out at /my-org/my-skills; keep it out of your\n" + - "index (e.g. echo 'my-org/' >> .git/info/exclude). Fetching a remote origin is a later,\n" + - "additive mode.\n\n" + + "in .skillrig/skills-lock.json. add is consume-only: it resolves the active origin\n" + + "(SKILLRIG_ORIGIN > project > global) exactly like every command and copies the\n" + + "skill byte-identically, injecting nothing.\n\n" + + "Two acquisition forms, chosen automatically and reported in the result:\n" + + " • Local — the configured origin OWNER/REPO is checked out at /OWNER/REPO;\n" + + " add reads that local checkout (resolved against the repo root, so it works from\n" + + " any subdirectory) — no network. Keep the checkout out of your index\n" + + " (e.g. 
echo 'my-org/' >> .git/info/exclude).\n" + + " • Remote — no local checkout exists; add fetches the skill subtree over git from\n" + + " the origin OWNER/REPO at the origin's @ref (or --pin), using a GitHub token from\n" + + " GH_TOKEN / GITHUB_TOKEN / `gh auth token` when one is available (public origins\n" + + " need none).\n\n" + + "--pin acquires an immutable version instead of the origin's branch tip: a bare\n" + + "semver (v1.4.0 / 1.4.0) expands via the origin's tag scheme to -v;\n" + + "anything else is a literal git ref (a full tag or commit SHA). Both forms of the\n" + + "same release resolve to the same content.\n\n" + "add is idempotent on identical content and refuses to overwrite a vendored skill\n" + "whose on-disk content diverges from the lock unless you pass --force. Requires a\n" + "git repository; commit the result, then run skillrig verify.", - Example: " # Vendor a skill from your configured origin (a local checkout at ./OWNER/REPO)\n" + + Example: " # Vendor a skill from your configured origin\n" + " skillrig add terraform-plan-review\n\n" + + " # Pin an immutable version (bare semver — expands via the origin's tag scheme)\n" + + " skillrig add terraform-plan-review --pin v1.4.0\n\n" + " # Preview what would be vendored, writing nothing\n" + " skillrig add terraform-plan-review --dry-run\n\n" + " # Overwrite a locally-diverged copy with the origin's content\n" + @@ -75,6 +86,7 @@ func newAddCmd(opts *globalOpts) *cobra.Command { }, } + cmd.Flags().StringVar(&ac.pin, "pin", "", "acquire an immutable version: a bare semver (v1.4.0) expands via the origin tag scheme, else a literal tag/SHA") cmd.Flags().BoolVar(&ac.dryRun, "dry-run", false, "report what would be vendored and recorded; write nothing") cmd.Flags().BoolVar(&ac.force, "force", false, "overwrite a vendored skill whose on-disk content diverges from the lock") @@ -104,31 +116,79 @@ func (ac *addCmd) run(cmd *cobra.Command) error { return usageNotGitRepo(addNotGitRepoWhy, err) } - 
originDir, ref := originDirRef(res.Origin) - // AR-1: anchor the local origin checkout to the repo root, not the process - // CWD. The destination (.agents/skills + the lock) is already repo-root-anchored - // via repoRoot; leaving the origin source relative made `add` resolve it against - // the CWD, so it failed from any subdirectory while the output still went to the - // repo root. Joining with repoRoot makes both sides consistent — `add` now works - // from anywhere in the repo. - originDir = filepath.Join(repoRoot, originDir) - - result, err := skillcore.Add(skillcore.AddOptions{ - OriginDir: originDir, - Ref: ref, - Skill: ac.skill, - RepoRoot: repoRoot, - Origin: res.Origin.String(), - Force: ac.force, - DryRun: ac.dryRun, - }) + opts := addOptionsFor(res.Origin, repoRoot, ac) + + result, err := skillcore.Add(opts) if err != nil { - return mapAddError(ac.skill, err) + return mapAddError(ac.skill, res.Origin.String(), err) } return renderAddResult(cmd.OutOrStdout(), result, ac.opts.json) } +// addOptionsFor classifies the resolved origin's acquisition form (D3) and builds +// the skillcore.AddOptions for it. The form is chosen automatically: +// +// - a file:// / filesystem-path LOCAL origin → remote-fetch form over a real +// git transport against that path (RepoURL = origin.CloneURL(), Local), so a +// local origin and the file:// test substrate are fetched without a checkout; +// - a remote OWNER/REPO checked out at /OWNER/REPO → the 002 +// local-copy form reading that checkout; +// - a remote OWNER/REPO with no checkout → remote-fetch form over GitHub. +// +// --pin and the destination/lock fields are common to all. skillcore gates the +// origin's convention before vendoring in the remote-fetch forms (FIX-4). 
+// +// AR-1: the local checkout is anchored to the repo root, not the process CWD — the +// destination (.agents/skills + the lock) is already repo-root-anchored via +// repoRoot, so anchoring the source there too makes add work from any subdirectory. +func addOptionsFor(origin config.Origin, repoRoot string, ac *addCmd) skillcore.AddOptions { + _, ref := originDirRef(origin) + + opts := skillcore.AddOptions{ + Ref: ref, + Skill: ac.skill, + RepoRoot: repoRoot, + Origin: origin.String(), + Pin: ac.pin, + Force: ac.force, + DryRun: ac.dryRun, + } + + // A file:// / path LOCAL origin has no OWNER/REPO checkout; fetch it over a + // real git transport from its CloneURL (FR-011 + file:// substrate). + if origin.IsLocal() { + opts.RepoURL = origin.CloneURL() + opts.Local = true + + return opts + } + + originDir, _ := originDirRef(origin) + localCheckout := filepath.Join(repoRoot, originDir) + + if isLocalCheckout(localCheckout) { + opts.OriginDir = localCheckout + + return opts + } + + // Remote form: setting Owner+Repo selects skillcore's remote-fetch path + // (the catalog/conventional skills/ subtree is resolved by skillcore). + opts.Owner = origin.Owner + opts.Repo = origin.Repo + + return opts +} + +// isLocalCheckout reports whether dir is a directory on disk — the signal that the +// origin is checked out locally, selecting the local-copy form over a remote fetch. +func isLocalCheckout(dir string) bool { + info, err := os.Stat(dir) + + return err == nil && info.IsDir() +} + // addNotGitRepoWhy is the project-scope rationale for add's not-a-repo error. const addNotGitRepoWhy = "project-scope add vendors into the repo's canonical .agents/skills " + "and writes a lock that verify checks against git" @@ -166,8 +226,11 @@ func usageNoOriginConfigured() *UsageError { // mapAddError maps skillcore's typed Add errors to navigational *UsageError // values (exit 1), authoring the what/why/fix prose while preserving the raw -// cause for --verbose. 
An unexpected error is wrapped generically. -func mapAddError(skill string, err error) error { +// cause for --verbose. origin is the OWNER/REPO[@REF] reference, anchored in the +// network/version error prose. An unexpected error is wrapped generically. All +// classes here map to exit 1 — the reserved exit 2 (verification) and 3 +// (prerequisite) are never emitted from add. +func mapAddError(skill, origin string, err error) error { var invalidName *skillcore.InvalidSkillNameError if errors.As(err, &invalidName) { return &UsageError{ @@ -218,6 +281,36 @@ func mapAddError(skill string, err error) error { } } + var remoteNotFound *skillcore.NotFoundError + if errors.As(err, &remoteNotFound) { + return mapNotFoundError(skill, remoteNotFound, err) + } + + var noVersion *skillcore.NoSuchVersionError + if errors.As(err, &noVersion) { + return &UsageError{ + Msg: fmt.Sprintf("%q has no version %q\n", noVersion.Skill, noVersion.Ref) + + "why: the pin does not resolve to a released tag or a commit in the origin\n" + + "fix: run skillrig search for the current version, or --pin an existing tag", + Cause: err, + } + } + + var authErr *skillcore.AuthError + if errors.As(err, &authErr) { + return mapAuthError(origin, err) + } + + var unreachErr *skillcore.UnreachableError + if errors.As(err, &unreachErr) { + return mapUnreachableError(origin, err) + } + + var convErr *skillcore.IncompatibleConventionError + if errors.As(err, &convErr) { + return mapConventionError(origin, convErr, err) + } + var gitErr *skillcore.GitError if errors.As(err, &gitErr) { return &UsageError{ @@ -230,3 +323,65 @@ func mapAddError(skill string, err error) error { return &UsageError{Msg: "add failed\nwhy: " + err.Error(), Cause: err} } + +// mapNotFoundError renders a remote *NotFoundError (the origin or the skill +// subtree is absent) as navigation. 
The D4 subtlety: GitHub returns "not found" +// (not 403) for a PRIVATE repo reached with no resolved token, so when the fetch +// was unauthenticated the fix adds the "if private, authenticate" hint — the +// agent must not be sent to re-check a skill name when the real problem is a +// missing credential. The raw *GitError is preserved for --verbose. Shared by +// add and search. +func mapNotFoundError(skill string, nf *skillcore.NotFoundError, cause error) error { + what := fmt.Sprintf("skill %q not found in the origin\n", skill) + if skill == "" { + what = "the origin was not found\n" + } + + fix := "fix: run skillrig search to list the skills the origin publishes" + if !nf.Authenticated { + fix += "; if the origin is private, authenticate first (gh auth login, or set GH_TOKEN / GITHUB_TOKEN)" + } + + return &UsageError{ + Msg: what + + "why: no such skill is published, or the origin is private and the fetch was unauthenticated\n" + + fix, + Cause: cause, + } +} + +// mapAuthError renders an *AuthError (a credential WAS presented and the origin +// rejected it — distinct from the not-found-because-private class) as navigation. +// Shared by add and search. +func mapAuthError(origin string, cause error) error { + return &UsageError{ + Msg: fmt.Sprintf("authentication failed reaching %s\n", origin) + + "why: the GitHub token presented was rejected (expired, revoked, or lacking access to a private origin)\n" + + "fix: refresh credentials with gh auth login, or set a valid GH_TOKEN / GITHUB_TOKEN", + Cause: cause, + } +} + +// mapUnreachableError renders an *UnreachableError (the origin host could not be +// resolved or connected to) as navigation. Shared by add and search. 
+func mapUnreachableError(origin string, cause error) error { + return &UsageError{ + Msg: fmt.Sprintf("could not reach %s\n", origin) + + "why: the origin host could not be resolved or connected to (offline, or a misspelled origin)\n" + + "fix: check your network connection and the origin spelling (OWNER/REPO)", + Cause: cause, + } +} + +// mapConventionError renders an *IncompatibleConventionError as navigation. The +// gate is exact-match (C1): the origin's catalog declares a convention this +// binary does not implement (a higher, lower, or absent/zero value all fail), so +// the fix is to align the tool and the origin. Shared by add and search. +func mapConventionError(origin string, ce *skillcore.IncompatibleConventionError, cause error) error { + return &UsageError{ + Msg: fmt.Sprintf("%s uses skill convention v%d (this tool supports exactly v%d)\n", origin, ce.Found, ce.Supported) + + "why: the origin's catalog declares a convention version this skillrig does not implement\n" + + "fix: update skillrig, or check the origin's .skillrig-origin.toml convention_version", + Cause: cause, + } +} diff --git a/internal/cli/addverify_test.go b/internal/cli/addverify_test.go index 2dbf654..95d854b 100644 --- a/internal/cli/addverify_test.go +++ b/internal/cli/addverify_test.go @@ -76,7 +76,7 @@ func TestMapAddError(t *testing.T) { t.Run(tt.name, func(t *testing.T) { t.Parallel() - got := mapAddError("x", tt.err) + got := mapAddError("x", "my-org/my-skills", tt.err) var ue *UsageError if !errors.As(got, &ue) { diff --git a/internal/cli/index.go b/internal/cli/index.go new file mode 100644 index 0000000..d96e4a5 --- /dev/null +++ b/internal/cli/index.go @@ -0,0 +1,164 @@ +package cli + +import ( + "errors" + "fmt" + "os" + "path/filepath" + + "github.com/spf13/cobra" + + "github.com/skillrig/cli/pkg/skillcore" +) + +// defaultIndexOut is the catalog's default destination: index.json at the origin +// repo root (contract index.md). 
+const defaultIndexOut = "index.json" + +// indexCmd holds the index command's flags and its injectable cwd seam. +type indexCmd struct { + opts *globalOpts + out string + + // getwd returns the working directory. Defaults to os.Getwd. + getwd func() (string, error) +} + +// newIndexCmd builds the `skillrig index` command (origin-side generator): run +// inside an origin repo, it regenerates the origin's catalog (index.json) from +// each skill's SKILL.md frontmatter so search has an honest, up-to-date catalog. +// It is not a consumer command — it writes the origin's own discovery artifact — +// and is deterministic: regenerating over an unchanged skill set is byte-identical. +func newIndexCmd(opts *globalOpts) *cobra.Command { + ic := &indexCmd{ + opts: opts, + getwd: osGetwd, + } + + cmd := &cobra.Command{ + Use: "index", + Short: "Regenerate the origin's catalog (index.json) from its skills' frontmatter", + Long: "index regenerates the ORIGIN's machine-readable catalog (index.json) by walking\n" + + "the origin's skills directory and parsing each skill's SKILL.md frontmatter — the\n" + + "same parser add and verify use. Run it INSIDE the origin repo (locally or in CI on\n" + + "merge): it is an origin-maintainer command, not a consumer command, and writes the\n" + + "origin's own discovery artifact that search later reads.\n\n" + + "It is a single-tip, full regeneration — the catalog is overwritten wholesale, sorted\n" + + "by name with a stable key order and a trailing newline, so regenerating over an\n" + + "unchanged skill set is byte-identical. 
The skillrigConvention and origin are read\n" + + "from the origin's .skillrig-origin.toml (never hardcoded), so producer and consumer\n" + + "share one source of truth.\n\n" + + "Exit 0 on a written (or unchanged) catalog; exit 1 outside an origin repo, on an\n" + + "unreadable skills directory, or on a malformed SKILL.md frontmatter.", + Example: " # Regenerate index.json at the origin repo root\n" + + " skillrig index\n\n" + + " # Write the catalog to an explicit path\n" + + " skillrig index --out catalog/index.json\n\n" + + " # Machine-readable summary of what was generated\n" + + " skillrig index --json", + // Custom validator (not cobra.NoArgs) so an extra positional yields + // what/why/fix instead of cobra's "unknown command" dead end (cli.md P1/P2). + Args: func(_ *cobra.Command, args []string) error { + if len(args) != 0 { + return usageIndexArgs(args) + } + + return nil + }, + RunE: func(cmd *cobra.Command, _ []string) error { + return ic.run(cmd) + }, + } + + cmd.Flags().StringVar(&ic.out, "out", defaultIndexOut, "path to write the catalog to (default: index.json at the origin root)") + + return cmd +} + +// run locates the origin repo root, regenerates the catalog from frontmatter via +// skillcore.GenerateCatalog (the one catalog generator, AP-04), writes it to the +// resolved --out path, and renders the summary. skillcore's failures are mapped +// to navigational *UsageError values (exit 1), preserving the raw cause for +// --verbose. 
+func (ic *indexCmd) run(cmd *cobra.Command) error { + cwd, err := ic.getwd() + if err != nil { + return &UsageError{Msg: "cannot determine working directory\nwhy: " + err.Error(), Cause: err} + } + + originRoot, err := gitToplevel(cmd.Context(), cwd) + if err != nil { + return usageIndexNotInOrigin(err) + } + + data, err := skillcore.GenerateCatalog(originRoot) + if err != nil { + return mapIndexError(err) + } + + // Parse the generated bytes back with the shared parser (AP-04) for the + // summary's skill count and convention — the same numbers a consumer would + // read, never recomputed from a parallel walk. + catalog, err := skillcore.ParseCatalog(data) + if err != nil { + return mapIndexError(err) + } + + outPath := ic.out + if !filepath.IsAbs(outPath) { + outPath = filepath.Join(originRoot, outPath) + } + + if err := os.WriteFile(outPath, data, 0o644); err != nil { //nolint:gosec // G306: index.json is a committed, world-readable catalog artifact. + return &UsageError{ + Msg: fmt.Sprintf("cannot write the catalog to %q\n", outPath) + + "why: " + err.Error() + "\n" + + "fix: check the path is writable, or pass --out to a different location", + Cause: err, + } + } + + return renderIndexResult(cmd.OutOrStdout(), indexResult{ + Out: outPath, + Skills: len(catalog.Skills), + Convention: catalog.SkillrigConvention, + }, ic.opts.json) +} + +// usageIndexArgs builds the navigational usage error when index is given +// positional arguments it does not take (errors-as-navigation: what / why / fix). +func usageIndexArgs(args []string) *UsageError { + return usageErrorf("index takes no arguments\n"+ + "why: it regenerates the whole origin catalog (got: %v)\n"+ + "fix: run skillrig index (add --out to choose where to write)", args) +} + +// usageIndexNotInOrigin builds the "not in an origin repo" usage error (exit 1): +// index must run inside the origin repository to find its .skillrig-origin.toml +// and skills directory. 
+func usageIndexNotInOrigin(cause error) *UsageError { + return &UsageError{ + Msg: "not in an origin repository\n" + + "why: index regenerates the origin's catalog and needs the origin repo (its .skillrig-origin.toml and skills/)\n" + + "fix: run skillrig index inside the origin repo checkout", + Cause: cause, + } +} + +// mapIndexError maps skillcore.GenerateCatalog's failures to navigational +// *UsageError values (exit 1): a missing/unreadable origin config or skills +// directory means this is not an origin repo; anything else (a malformed +// SKILL.md frontmatter) names the offending file via the wrapped cause. The raw +// cause is preserved for --verbose. +func mapIndexError(err error) error { + if errors.Is(err, os.ErrNotExist) { + return usageIndexNotInOrigin(err) + } + + return &UsageError{ + Msg: "cannot generate the catalog\n" + + "why: " + err.Error() + "\n" + + "fix: check the named SKILL.md frontmatter, the origin's .skillrig-origin.toml, and the skills directory", + Cause: err, + } +} diff --git a/internal/cli/output.go b/internal/cli/output.go index e356d36..7d22da2 100644 --- a/internal/cli/output.go +++ b/internal/cli/output.go @@ -113,6 +113,105 @@ func addSummary(r skillcore.AddResult) string { } } +// searchResultJSON is the complete, untruncated --json view of a search. It +// carries the resolved origin and every matching skill with all the fields add +// needs (name, version, namespace, description, topics, path, requires). It is +// the presentation projection of the matched skillcore.CatalogEntry slice (which +// carries JSON tags of its own, reused here for completeness). +type searchResultJSON struct { + Origin string `json:"origin"` + Skills []skillcore.CatalogEntry `json:"skills"` +} + +// searchDescWidth is the human-output truncation width for a skill's one-line +// description (cli.md Principle 3: compact human output ~80 chars, complete +// --json). +const searchDescWidth = 80 + +// renderSearchResult writes a search outcome to w. 
With jsonOut it emits one +// complete JSON object (origin + every matching skill, all fields, [] not null +// when empty); otherwise a compact human list — one line per match (name, +// version, truncated description) plus a summary/footer hint — whose line count +// is bounded by the number of matches plus a small constant (Constitution §II). +// An empty result is "no skills matched" and is still success (exit 0). Data goes +// to stdout (the caller passes cmd.OutOrStdout()). +func renderSearchResult(w io.Writer, origin string, matches []skillcore.CatalogEntry, jsonOut bool) error { + if jsonOut { + enc := json.NewEncoder(w) + enc.SetEscapeHTML(false) + + skills := matches + if skills == nil { + skills = []skillcore.CatalogEntry{} + } + + return enc.Encode(searchResultJSON{Origin: origin, Skills: skills}) + } + + if len(matches) == 0 { + _, err := io.WriteString(w, "no skills matched\n"+searchEmptyFooter+"\n") + + return err + } + + var b strings.Builder + + for _, e := range matches { + fmt.Fprintf(&b, "%s %s — %s\n", e.Name, e.Version, truncateDesc(e.Description)) + } + + fmt.Fprintf(&b, "%d skill(s) · run: skillrig add \n", len(matches)) + + _, err := io.WriteString(w, b.String()) + + return err +} + +// searchEmptyFooter is the next-step hint for an empty search result (still exit 0). +const searchEmptyFooter = "→ broaden the query, or run skillrig search with no filter to list all" + +// truncateDesc collapses a description's newlines to spaces and clips it to searchDescWidth +// runes (not bytes, so a multi-byte character is never split); the complete text is in --json. +func truncateDesc(s string) string { + runes := []rune(strings.ReplaceAll(s, "\n", " ")) + if len(runes) <= searchDescWidth { + return string(runes) + } + + return string(runes[:searchDescWidth-1]) + "…" +} + +// indexResult is the presentation-layer view of an index generation: where the +// catalog was written, how many skills it carries, and the convention it +// declares. It is the single struct both renderers consume.
+type indexResult struct { + Out string `json:"out"` + Skills int `json:"skills"` + Convention int `json:"convention"` +} + +// indexFooterHint is the next-step footer for a human index summary. +const indexFooterHint = "→ commit it so search reads the current catalog" + +// renderIndexResult writes an index outcome to w. With jsonOut it emits one +// complete JSON object (out, skills, convention — all keys present); otherwise a +// compact human summary (≤2 lines incl. the footer hint). Data goes to stdout +// (the caller passes cmd.OutOrStdout()). +func renderIndexResult(w io.Writer, r indexResult, jsonOut bool) error { + if jsonOut { + enc := json.NewEncoder(w) + enc.SetEscapeHTML(false) + + return enc.Encode(r) + } + + summary := fmt.Sprintf("indexed %d skill(s) → %s\n", r.Skills, r.Out) + + _, err := io.WriteString(w, summary+indexFooterHint+"\n") + + return err +} + // verifyReportJSON is the complete, untruncated --json view of a verify report. // Top-level keys ok,counts,verdicts are always present; counts carries all five // fields and verdicts every checked skill with all six fields. It is the diff --git a/internal/cli/root.go b/internal/cli/root.go index 7868002..35c5e1a 100644 --- a/internal/cli/root.go +++ b/internal/cli/root.go @@ -108,6 +108,8 @@ func Execute() int { // separate so each user story wires its command here as it lands. 
func registerSubcommands(root *cobra.Command, opts *globalOpts) { root.AddCommand(newInitCmd(opts)) + root.AddCommand(newSearchCmd(opts)) root.AddCommand(newAddCmd(opts)) root.AddCommand(newVerifyCmd(opts)) + root.AddCommand(newIndexCmd(opts)) } diff --git a/internal/cli/search.go b/internal/cli/search.go new file mode 100644 index 0000000..1aadd75 --- /dev/null +++ b/internal/cli/search.go @@ -0,0 +1,289 @@ +package cli + +import ( + "context" + "errors" + "fmt" + "os" + "path/filepath" + "strings" + + "github.com/spf13/cobra" + + "github.com/skillrig/cli/internal/config" + "github.com/skillrig/cli/pkg/skillcore" +) + +// catalogName is the origin's committed, machine-readable catalog file +// (index.json) that search reads, gates, and matches against (contract search.md). +const catalogName = "index.json" + +// searchCmd holds the search command's flags and its injectable seams. Production +// uses the os-backed defaults; tests inject deterministic stubs (cwd, env). +type searchCmd struct { + opts *globalOpts + query []string + topics []string + + // getwd returns the working directory. Defaults to os.Getwd. + getwd func() (string, error) + // env is the environment accessor used by the origin resolver. + env config.Env +} + +// newSearchCmd builds the `skillrig search [QUERY...]` command (Query pattern): +// discover skills published by the resolved origin by reading its catalog, +// gating the convention version, and matching deterministically. Read-only, +// exit 0 on any well-formed query (including an empty result), exit 1 on a +// config/convention/reachability problem. 
+func newSearchCmd(opts *globalOpts) *cobra.Command { + sc := &searchCmd{ + opts: opts, + getwd: osGetwd, + env: config.OSEnv, + } + + cmd := &cobra.Command{ + Use: "search [QUERY...]", + Short: "Discover skills published by your configured origin", + // QUERY terms are free-form (a token-AND over name+description+topics), so + // any number of positional args is valid — declare it explicitly rather + // than leave the args contract unstated. + Args: cobra.ArbitraryArgs, + Long: "search discovers the skills your configured origin publishes. It resolves the\n" + + "active origin (SKILLRIG_ORIGIN > project > global) exactly like every command,\n" + + "reads the origin's catalog (index.json), and matches deterministically: a free-text\n" + + "QUERY is a case-insensitive token-AND substring over name+description+topics (a\n" + + "skill matches only if EVERY term is present), and --topic adds an exact-string,\n" + + "case-insensitive AND filter. Results are ordered by a fixed relevance bucket\n" + + "(exact-name > name > topic > description) then by name — no fuzzy or learned\n" + + "ranking. An empty result is success (exit 0); add --json for the complete record.\n\n" + + "search is read-only and needs no git working tree — only a resolvable origin.", + Example: " # List every skill the origin publishes\n" + + " skillrig search\n\n" + + " # Free-text query (token-AND over name + description + topics)\n" + + " skillrig search terraform plan\n\n" + + " # Filter by topic (repeatable; AND across topics)\n" + + " skillrig search --topic aws --topic terraform", + RunE: func(cmd *cobra.Command, args []string) error { + sc.query = args + + return sc.run(cmd) + }, + } + + cmd.Flags().StringArrayVar(&sc.topics, "topic", nil, "filter to skills carrying this topic (repeatable; AND across topics)") + + return cmd +} + +// run resolves the origin, loads + gates the catalog, matches the query, and +// renders the two-level result. 
skillcore's typed errors are mapped to +// navigational *UsageError values (exit 1), preserving the raw cause for +// --verbose; a well-formed query — even one that matches nothing — is exit 0. +func (sc *searchCmd) run(cmd *cobra.Command) error { + cwd, err := sc.getwd() + if err != nil { + return &UsageError{Msg: "cannot determine working directory\nwhy: " + err.Error(), Cause: err} + } + + res, err := config.ResolveOrigin(cwd, sc.env) + if err != nil { + return &UsageError{Msg: "cannot resolve the active origin\nwhy: " + err.Error() + "\n" + missingOriginFix, Cause: err} + } + + if res.Source == config.SourceNone { + return usageNoOriginConfigured() + } + + // A git repo is OPTIONAL for search: it only enables the local-checkout + // fast-path (reading a remote origin's committed OWNER/REPO checkout off + // disk). Outside a repo — or against a remote/file:// origin with no checkout + // — repoRoot is left empty and loadCatalog fetches the catalog directly, so + // `skillrig search` works from any directory (FIX-7). + repoRoot, err := gitToplevel(cmd.Context(), cwd) + if err != nil { + if !errors.Is(err, errNotGitRepo) { + // An unexpected failure (e.g. context cancellation) is not a "not a + // repo" precondition — surface it rather than silently proceed. + return mapSearchError(res.Origin.String(), err) + } + + repoRoot = "" + } + + catalog, err := loadCatalog(cmd.Context(), repoRoot, res.Origin) + if err != nil { + return mapSearchError(res.Origin.String(), err) + } + + if err := skillcore.CheckConvention(catalog.SkillrigConvention); err != nil { + return mapSearchError(res.Origin.String(), err) + } + + matches := skillcore.Search(catalog, sc.query, sc.topics) + + return renderSearchResult(cmd.OutOrStdout(), res.Origin.String(), matches, sc.opts.json) +} + +// loadCatalog acquires and parses the origin's index.json, choosing the +// transport from the origin form exactly as add does (FIX-2, contract search.md +// step 2). 
The catalog is fetched PER CALL — never cached — so every search sees +// the origin as it is now: +// +// - bare-path LOCAL origin → read /index.json from disk (no transport); +// - remote OWNER/REPO with a local checkout at /OWNER/REPO → read +// that checkout's index.json (the 002 local-copy form, kept green); +// - otherwise (remote with no checkout, or a file:// origin) → a sparse git +// fetch of index.json at the resolved @ref via skillcore.FetchCatalog (the +// ONE catalog acquisition path, AP-04). +// +// Parse stays separate from the convention gate (run by the caller) so a +// malformed catalog and an incompatible convention are distinct failures. +func loadCatalog(ctx context.Context, repoRoot string, origin config.Origin) (skillcore.Catalog, error) { + if path, ok := localCatalogPath(repoRoot, origin); ok { + return readCatalogFile(path) + } + + catalog, err := skillcore.FetchCatalog(ctx, skillcore.CatalogRequest{ + RepoURL: origin.CloneURL(), + Origin: origin.String(), + Ref: origin.Ref, + Local: origin.IsLocal(), + }) + if err != nil { + return skillcore.Catalog{}, err + } + + return catalog, nil +} + +// localCatalogPath returns the on-disk index.json path to read when the origin's +// catalog is available locally (no transport), and false when it must be +// fetched. A bare-path LOCAL origin (a filesystem path, not a file:// URL) reads +// /index.json directly (independent of repoRoot); a remote OWNER/REPO reads +// its 002 local checkout at /OWNER/REPO only when there IS a repo root +// and that directory exists. A file:// origin, a checkout-less remote, and the +// no-repo case (empty repoRoot, e.g. search run outside a git repo) all return +// false so the caller fetches (FIX-7). 
+func localCatalogPath(repoRoot string, origin config.Origin) (string, bool) { + if origin.IsLocal() { + if strings.HasPrefix(origin.Path, "file://") { + return "", false + } + + return filepath.Join(origin.Path, catalogName), true + } + + // The remote local-checkout fast-path is anchored at the repo root; with no + // repo (empty root) there is no checkout to read, so fetch. + if repoRoot == "" { + return "", false + } + + originDir, _ := originDirRef(origin) + checkout := filepath.Join(repoRoot, originDir) + + if isLocalCheckout(checkout) { + return filepath.Join(checkout, catalogName), true + } + + return "", false +} + +// readCatalogFile reads and parses a local index.json, tagging the read and parse +// failures distinctly so mapSearchError can author the right what/why/fix. +func readCatalogFile(catalogPath string) (skillcore.Catalog, error) { + //nolint:gosec // G304: path is the resolved origin path/checkout + a fixed file name, not attacker-controlled. + data, err := os.ReadFile(catalogPath) + if err != nil { + return skillcore.Catalog{}, &catalogReadError{path: catalogPath, cause: err} + } + + catalog, err := skillcore.ParseCatalog(data) + if err != nil { + return skillcore.Catalog{}, &catalogParseError{path: catalogPath, cause: err} + } + + return catalog, nil +} + +// catalogReadError marks the origin's catalog as unreadable (absent or no +// permission). It is presentation-free here only in that mapSearchError renders +// the what/why/fix; it carries the path and raw cause for --verbose. +type catalogReadError struct { + path string + cause error +} + +func (e *catalogReadError) Error() string { + return fmt.Sprintf("reading catalog %q: %v", e.path, e.cause) +} +func (e *catalogReadError) Unwrap() error { return e.cause } + +// catalogParseError marks the origin's catalog as malformed JSON. 
+type catalogParseError struct { + path string + cause error +} + +func (e *catalogParseError) Error() string { + return fmt.Sprintf("parsing catalog %q: %v", e.path, e.cause) +} +func (e *catalogParseError) Unwrap() error { return e.cause } + +// mapSearchError maps the failure classes search can surface to navigational +// *UsageError values (exit 1), authoring the what/why/fix prose while preserving +// the raw cause for --verbose. The convention mismatch, an unreachable/auth +// origin, and a malformed catalog are distinct messages so the agent debugs the +// real problem (errors-as-navigation; do not conflate look-alike classes). +func mapSearchError(origin string, err error) error { + var convErr *skillcore.IncompatibleConventionError + if errors.As(err, &convErr) { + return mapConventionError(origin, convErr, err) + } + + var authErr *skillcore.AuthError + if errors.As(err, &authErr) { + return mapAuthError(origin, err) + } + + var unreachErr *skillcore.UnreachableError + if errors.As(err, &unreachErr) { + return mapUnreachableError(origin, err) + } + + var notFound *skillcore.NotFoundError + if errors.As(err, &notFound) { + return mapNotFoundError("", notFound, err) + } + + var readErr *catalogReadError + if errors.As(err, &readErr) { + return &UsageError{ + Msg: fmt.Sprintf("cannot read the origin catalog at %q\n", readErr.path) + + "why: the origin has no index.json there (origin not checked out, or its catalog has not been generated)\n" + + "fix: check out the origin at that path, or run skillrig index in the origin and commit index.json", + Cause: err, + } + } + + var parseErr *catalogParseError + if errors.As(err, &parseErr) { + return &UsageError{ + Msg: fmt.Sprintf("the origin catalog at %q is malformed\n", parseErr.path) + + "why: index.json is not valid JSON\n" + + "fix: regenerate it with skillrig index in the origin, then commit the result", + Cause: err, + } + } + + // A not-a-repo failure is already a *UsageError from loadCatalog; pass typed + // 
usage errors through untouched so their authored prose survives. + var usageErr *UsageError + if errors.As(err, &usageErr) { + return usageErr + } + + return &UsageError{Msg: "search failed\nwhy: " + err.Error(), Cause: err} +} diff --git a/internal/config/origin.go b/internal/config/origin.go index c8a9bd3..f820701 100644 --- a/internal/config/origin.go +++ b/internal/config/origin.go @@ -5,6 +5,8 @@ package config import ( "fmt" + "os" + "path/filepath" "regexp" "strings" ) @@ -37,21 +39,63 @@ func (e *InvalidOriginError) Error() string { return fmt.Sprintf("invalid origin %q", e.Value) } -// Origin is an org's skill source in OWNER/REPO[@REF] form. It is the single -// value this feature reads, validates, records, and resolves. Ref is optional: -// when set it pins the origin to a branch (for an origin, a moving pointer the -// consumer tracks — distinct from the immutable tag/SHA a *skill* is pinned to); -// an empty Ref means the origin's default branch. +// Origin is an org's skill source in one of two forms (D3): +// +// - REMOTE: OWNER/REPO[@REF] — Owner and Repo are set, Path is empty. Ref is +// optional: when set it pins the origin to a branch (a moving pointer the +// consumer tracks — distinct from the immutable tag/SHA a *skill* is pinned +// to); an empty Ref means the origin's default branch. +// - LOCAL: an explicit filesystem path or a file:// URL — Path is set, Owner +// and Repo are empty (FR-011). This points at a local checkout/bare repo so +// fetches read from disk instead of github.com; it is also the file:// +// substrate the remote-fetch path is tested against. +// +// The two forms are mutually exclusive. IsLocal reports which one a value is; +// CloneURL renders the git transport target for either. type Origin struct { Owner string Repo string Ref string + // Path is the local origin's filesystem path or file:// URL (LOCAL form). + // Empty for the remote OWNER/REPO form. 
+ Path string } -// String renders the origin as "Owner/Repo", appending "@Ref" when a ref is -// set. The zero Origin (the SourceNone sentinel) renders as "" so a "no origin" -// result never stringifies to a misleading "/" that looks configured. +// IsLocal reports whether the origin is the LOCAL form (a filesystem path or +// file:// URL) rather than the remote OWNER/REPO form. The two are mutually +// exclusive, so a non-empty Path is the discriminant. +func (o Origin) IsLocal() bool { + return o.Path != "" +} + +// CloneURL renders the git transport target the fetch layer clones from. For a +// LOCAL origin it is the Path as a file:// URL (a bare/working-tree path becomes +// file://, an already-file:// path passes through), so git runs a real +// transport handshake offline; for a REMOTE origin it is the GitHub HTTPS clone +// URL. The token (remote only) is never embedded here — git.go injects it via +// http.extraHeader — so the result is safe to surface in diagnostics. +func (o Origin) CloneURL() string { + if o.IsLocal() { + if strings.HasPrefix(o.Path, "file://") { + return o.Path + } + + return "file://" + o.Path + } + + return "https://github.com/" + o.Owner + "/" + o.Repo + ".git" +} + +// String renders the origin to its canonical configured form: a LOCAL origin is +// its Path verbatim (round-trips through ParseOrigin); a REMOTE origin is +// "Owner/Repo" with "@Ref" appended when a ref is set. The zero Origin (the +// SourceNone sentinel) renders as "" so a "no origin" result never stringifies +// to a misleading "/" that looks configured. func (o Origin) String() string { + if o.IsLocal() { + return o.Path + } + if o.Owner == "" && o.Repo == "" { return "" } @@ -64,17 +108,33 @@ func (o Origin) String() string { return s } -// ParseOrigin trims surrounding whitespace and validates s against the -// OWNER/REPO[@REF] shape. 
The optional ref is split on the first '@' (the -// owner/repo charset excludes '@', so the split is unambiguous) and validated -// against refPattern; a trailing '@' with no ref is rejected. On failure it -// returns a typed *InvalidOriginError carrying the offending value; the -// user-facing expected-format guidance is rendered by internal/cli (FR-012). A -// blank string is rejected; callers that treat blank as "unset" (e.g. -// SKILLRIG_ORIGIN) must check before calling. +// ParseOrigin trims surrounding whitespace and classifies s into one of the two +// origin forms (D3, FR-011): +// +// - LOCAL: a file:// URL, or a filesystem path (absolute "/…", explicit +// "./"/"../", or "~"-rooted). These yield an Origin with Path set; a "~" +// prefix is expanded against $HOME. No @REF split is applied — a local path +// may legitimately contain '@', and the local form has no origin-level ref. +// - REMOTE: bare OWNER/REPO[@REF]. The optional ref is split on the first '@' +// (the owner/repo charset excludes '@', so the split is unambiguous) and +// validated against refPattern; a trailing '@' with no ref is rejected. +// +// On failure it returns a typed *InvalidOriginError carrying the offending +// value; the user-facing expected-format guidance is rendered by internal/cli +// (FR-012). A blank string is rejected; callers that treat blank as "unset" +// (e.g. SKILLRIG_ORIGIN) must check before calling. 
func ParseOrigin(s string) (Origin, error) { trimmed := strings.TrimSpace(s) + if isLocalForm(trimmed) { + path, err := normalizeLocalPath(trimmed) + if err != nil { + return Origin{}, &InvalidOriginError{Value: s} + } + + return Origin{Path: path}, nil + } + ownerRepo, ref, hasRef := strings.Cut(trimmed, "@") if !originPattern.MatchString(ownerRepo) { return Origin{}, &InvalidOriginError{Value: s} @@ -88,3 +148,38 @@ func ParseOrigin(s string) (Origin, error) { return Origin{Owner: owner, Repo: repo, Ref: ref}, nil } + +// isLocalForm reports whether s is the LOCAL origin form: a file:// URL or a +// path-shaped value (absolute, explicit-relative, or "~"-rooted). A bare +// OWNER/REPO stays the remote form — only these unambiguous path markers select +// local, so the two forms never collide. +func isLocalForm(s string) bool { + return strings.HasPrefix(s, "file://") || + strings.HasPrefix(s, "/") || + strings.HasPrefix(s, "./") || + strings.HasPrefix(s, "../") || + strings.HasPrefix(s, "~/") || + s == "~" +} + +// normalizeLocalPath canonicalizes a LOCAL origin value to an absolute path or a +// file:// URL. A file:// URL passes through unchanged; a "~"/"~/" prefix is +// expanded against $HOME; a relative path is made absolute against the working +// directory so the recorded origin is stable regardless of where a later command +// runs. The git transport (CloneURL) turns a bare path into file://. 
+func normalizeLocalPath(s string) (string, error) { + if strings.HasPrefix(s, "file://") { + return s, nil + } + + if s == "~" || strings.HasPrefix(s, "~/") { + home, err := os.UserHomeDir() + if err != nil { + return "", err + } + + s = filepath.Join(home, strings.TrimPrefix(strings.TrimPrefix(s, "~"), "/")) + } + + return filepath.Abs(s) +} diff --git a/internal/config/origin_test.go b/internal/config/origin_test.go index 33b2338..3bd39db 100644 --- a/internal/config/origin_test.go +++ b/internal/config/origin_test.go @@ -16,6 +16,10 @@ func TestParseOrigin(t *testing.T) { wantOwner string wantRepo string wantRef string + // wantPathSuffix asserts the LOCAL form (FIX-1): when non-empty, the + // resolved Path must end with it (the path is made absolute, so an exact match + // is environment-dependent; assert the suffix). Owner/Repo must be empty. + wantPathSuffix string }{ {name: "valid", in: "my-org/my-skills", wantOwner: "my-org", wantRepo: "my-skills"}, {name: "valid with dots and underscores", in: "my.org_1/skills.v2_x", wantOwner: "my.org_1", wantRepo: "skills.v2_x"}, @@ -25,11 +29,16 @@ func TestParseOrigin(t *testing.T) { {name: "commit ref", in: "my-org/my-skills@9f2c1a0", wantOwner: "my-org", wantRepo: "my-skills", wantRef: "9f2c1a0"}, {name: "branch ref with slash", in: "my-org/my-skills@feature/auth", wantOwner: "my-org", wantRepo: "my-skills", wantRef: "feature/auth"}, {name: "ref with surrounding whitespace trimmed", in: " my-org/my-skills@staging\n", wantOwner: "my-org", wantRepo: "my-skills", wantRef: "staging"}, + // FIX-1 local forms: an absolute path and a file:// URL are the LOCAL form + // (Path set, Owner/Repo empty), the seam for FR-011 and the file:// test + // substrate. The absolute path "/my-skills" is no longer an error (it was + // the old "missing owner" remote case). 
+ {name: "absolute path is local", in: "/my-skills", wantPathSuffix: "/my-skills"}, + {name: "file url is local", in: "file:///tmp/origin.git", wantPathSuffix: "file:///tmp/origin.git"}, {name: "empty", in: "", wantErr: true}, {name: "blank whitespace", in: " ", wantErr: true}, {name: "no slash", in: "my-org-my-skills", wantErr: true}, {name: "missing repo", in: "my-org/", wantErr: true}, - {name: "missing owner", in: "/my-skills", wantErr: true}, {name: "too many segments", in: "my-org/team/skills", wantErr: true}, {name: "illegal char", in: "my org/my skills", wantErr: true}, {name: "trailing at with empty ref", in: "my-org/my-skills@", wantErr: true}, @@ -55,6 +64,18 @@ func TestParseOrigin(t *testing.T) { t.Fatalf("ParseOrigin(%q) unexpected error: %v", tc.in, err) } + if tc.wantPathSuffix != "" { + if got.Owner != "" || got.Repo != "" { + t.Errorf("ParseOrigin(%q) = %+v, want local form (empty Owner/Repo)", tc.in, got) + } + + if !got.IsLocal() || !strings.HasSuffix(got.Path, tc.wantPathSuffix) { + t.Errorf("ParseOrigin(%q).Path = %q, want suffix %q (IsLocal=%v)", tc.in, got.Path, tc.wantPathSuffix, got.IsLocal()) + } + + return + } + if got.Owner != tc.wantOwner || got.Repo != tc.wantRepo || got.Ref != tc.wantRef { t.Errorf("ParseOrigin(%q) = %+v, want {Owner:%q Repo:%q Ref:%q}", tc.in, got, tc.wantOwner, tc.wantRepo, tc.wantRef) } diff --git a/pkg/skillcore/add.go b/pkg/skillcore/add.go index 2993eab..029a9ee 100644 --- a/pkg/skillcore/add.go +++ b/pkg/skillcore/add.go @@ -2,11 +2,13 @@ package skillcore import ( "bytes" + "context" "fmt" "io" "io/fs" "os" "path/filepath" + "regexp" "strings" ) @@ -25,10 +27,19 @@ const ( // vendorRoot is the canonical, repo-relative root every skill is vendored under. const vendorRoot = ".agents/skills" -// AddOptions configures Add. The caller supplies an already-resolved local -// origin checkout (OriginDir + Ref); skillcore neither resolves origins, reads -// config, nor fetches. +// AddOptions configures Add. 
The CLI supplies an already-resolved origin and +// has classified its form (local-path vs remote OWNER/REPO) by populating the +// coordinates below; skillcore neither resolves origins nor reads config. +// +// Form selection: a non-empty Owner AND Repo selects the REMOTE form (Add +// fetches the skill subtree over git from https://github.com/Owner/Repo); +// otherwise Add uses the LOCAL-PATH form against the OriginDir checkout (002 +// behavior, unchanged). The two forms are mutually exclusive — the remote +// coordinates are simply absent for a local origin. type AddOptions struct { + // OriginDir and Ref drive the LOCAL-PATH form: OriginDir is the local origin + // checkout, Ref the revision to read (empty → HEAD upstream). Ignored when + // remote coordinates are set. OriginDir string Ref string Skill string @@ -38,10 +49,55 @@ type AddOptions struct { // top-level origin field; it does not parse or resolve it (presentation- and // resolution-free). Empty leaves any existing lock origin untouched. Origin string + // Owner and Repo are the remote origin's OWNER/REPO halves. They name the + // origin in error reporting and, when RepoURL is empty, derive the GitHub + // HTTPS clone URL. They are empty for a file:// local origin (which carries + // only RepoURL). + Owner string + Repo string + // RepoURL is the git transport target for the remote-fetch form when set — + // the origin's file:// (FR-011, the file:// test substrate) or any + // caller-supplied URL. Empty means "derive https://github.com/Owner/Repo.git". + // The CLI fills it from config.Origin.CloneURL(). + RepoURL string + // Local marks RepoURL as a file:// (local) target: no GitHub token is + // resolved for the fetch, and its failures are never the remote + // auth/private-not-found classes. The CLI sets it from config.Origin.IsLocal(). + Local bool + // SkillPath is the repo-relative subtree to fetch in the remote form (the + // catalog's path, e.g. "skills/"). 
Empty defaults to the conventional + // skills/. Unused by the local-path form. + SkillPath string + // Pin is the optional --pin reference for the remote form: a bare semver + // (^v?SEMVER$) is expanded via the name-vSEMVER tag scheme to + // -v; anything else is a literal git ref or commit SHA passed + // through unchanged (C3). Empty means "use Ref" (the origin @ref branch). + Pin string Force bool DryRun bool } +// isRemote reports whether opts selects the remote-fetch form: an explicit +// RepoURL (the file:// local origin / test substrate) OR both OWNER and REPO +// halves of a remote origin reference. The two forms are mutually exclusive, so +// neither marker present means the LOCAL-PATH checkout form (002). +func (opts AddOptions) isRemote() bool { + return opts.RepoURL != "" || (opts.Owner != "" && opts.Repo != "") +} + +// cloneURL derives the git transport target for the remote-fetch form: an +// explicit RepoURL (the file:// origin) when set, else the GitHub HTTPS URL for +// OWNER/REPO. The token is never embedded here — git.go injects it via the +// GIT_CONFIG http.extraHeader env (kept out of argv) — so the URL is safe to +// surface in diagnostics. +func (opts AddOptions) cloneURL() string { + if opts.RepoURL != "" { + return opts.RepoURL + } + + return "https://" + defaultGitHubHost + "/" + opts.Owner + "/" + opts.Repo + ".git" +} + // AddResult reports what Add did, for the CLI to render. type AddResult struct { Name string @@ -77,30 +133,47 @@ func (e *OverwriteError) Error() string { return fmt.Sprintf("refusing to overwrite %q", e.Path) } -// Add vendors one skill from the local origin at opts.OriginDir into -// opts.RepoRoot's canonical .agents/skills//, byte-identical and -// mode-preserving, then writes/updates the lock. 
It refuses a divergent -// overwrite unless opts.Force, writes nothing when opts.DryRun, and is +// acquired is the form-independent result of locating a skill's source: the +// on-disk directory to vendor from, its git-canonical tree-SHA, the resolved +// upstream commit, and the human-readable version/tag to record in the lock. +// Both the local-path and remote-fetch forms produce one, so the vendor + lock +// path downstream is identical (AP-04: one byte-identical vendor, one tree-SHA, +// one lock writer). +type acquired struct { + srcDir string + treeSha string + commit string + version string + // cleanup removes any temp checkout the acquisition created (remote form). Add + // always defers it, so it is never nil — the local-path form returns a no-op + // (it vendors from the in-repo origin checkout, with nothing to remove). + cleanup func() +} + +// Add vendors one skill into opts.RepoRoot's canonical .agents/skills//, +// byte-identical and mode-preserving, then writes/updates the lock. The source +// is the local origin checkout at opts.OriginDir (local-path form) or a remote +// OWNER/REPO fetched over git (remote form, when opts.Owner and opts.Repo are +// set); both forms converge on the same vendor + lock path. It refuses a +// divergent overwrite unless opts.Force, writes nothing when opts.DryRun, and is // idempotent on identical content. 
func Add(opts AddOptions) (AddResult, error) { - srcDir, err := prepareSource(opts) - if err != nil { + if err := validateSkillName(opts.Skill); err != nil { return AddResult{}, err } - manifest, err := ParseManifest(filepath.Join(srcDir, "skill.toml")) + acq, err := acquireSource(opts) if err != nil { return AddResult{}, err } - originRelPath := "skills/" + opts.Skill + defer acq.cleanup() - treeSha, err := TreeSHA(opts.OriginDir, opts.Ref, originRelPath) - if err != nil { + if err := ensureNoSymlinks(acq.srcDir); err != nil { return AddResult{}, err } - commit, err := revParse(opts.OriginDir, opts.Ref) + manifest, err := ParseManifest(filepath.Join(acq.srcDir, "SKILL.md")) if err != nil { return AddResult{}, err } @@ -108,17 +181,22 @@ func Add(opts AddOptions) (AddResult, error) { destPath := vendorRoot + "/" + opts.Skill destDir := filepath.Join(opts.RepoRoot, ".agents", "skills", opts.Skill) - action, err := resolvePlacement(opts, manifest.Name, srcDir, destDir, treeSha) + action, err := resolvePlacement(opts, manifest.Name, acq.srcDir, destDir, acq.treeSha) if err != nil { return AddResult{}, err } + version := manifest.Version + if acq.version != "" { + version = acq.version + } + result := AddResult{ Name: manifest.Name, - Version: manifest.Version, + Version: version, Path: destPath, - Commit: commit, - TreeSha: treeSha, + Commit: acq.commit, + TreeSha: acq.treeSha, Action: action, DryRun: opts.DryRun, } @@ -133,7 +211,7 @@ func Add(opts AddOptions) (AddResult, error) { } } - if err := copyTreePreservingModes(srcDir, destDir); err != nil { + if err := copyTreePreservingModes(acq.srcDir, destDir); err != nil { return AddResult{}, err } @@ -144,28 +222,160 @@ func Add(opts AddOptions) (AddResult, error) { return result, nil } -// prepareSource validates the skill name, locates the origin skill subtree, and -// rejects any symlink within it — all the safety pre-flight before Add touches -// the filesystem. 
opts.Skill is used as a path segment for the source, the -// destination, and os.RemoveAll on overwrite, so a traversal name (e.g. "../x") -// must be refused here, before any copy or delete can escape the canonical -// subtree; a symlink would let copy/compare follow it outside the subtree and -// break byte-identical/git-canonical vendoring. -func prepareSource(opts AddOptions) (string, error) { - if err := validateSkillName(opts.Skill); err != nil { - return "", err +// acquireSource locates the skill's source for the selected form, computing the +// git-canonical tree-SHA, the resolved upstream commit, and (for a remote pin) +// the resolved version/tag. The skill name is validated by the caller (Add) +// before this runs, so opts.Skill is already a safe single path segment. +func acquireSource(opts AddOptions) (acquired, error) { + if opts.isRemote() { + return acquireRemote(opts) } + return acquireLocal(opts) +} + +// acquireLocal is the 002 local-path form: vendor from the in-repo origin +// checkout at opts.OriginDir. The tree-SHA and commit come from the local git +// repo; the version is the manifest's (acquired.version left empty so Add reads +// the manifest). The cleanup is a no-op — nothing temporary was created. +func acquireLocal(opts AddOptions) (acquired, error) { srcDir, err := locateSkillSource(opts) if err != nil { - return "", err + return acquired{}, err } - if err := ensureNoSymlinks(srcDir); err != nil { - return "", err + originRelPath := "skills/" + opts.Skill + + treeSha, err := TreeSHA(opts.OriginDir, opts.Ref, originRelPath) + if err != nil { + return acquired{}, err } - return srcDir, nil + commit, err := revParse(opts.OriginDir, opts.Ref) + if err != nil { + return acquired{}, err + } + + return acquired{ + srcDir: srcDir, + treeSha: treeSha, + commit: commit, + cleanup: func() {}, + }, nil +} + +// acquireRemote is the remote-fetch form. 
Before vendoring it GATES the origin's +// convention (FIX-4 / H1): it fetches the origin's index.json at the origin's +// @ref and runs CheckConvention on its skillrigConvention (EXACT match, == 1, C1), +// returning *IncompatibleConventionError when the origin speaks a convention this +// binary does not implement, so a mismatching origin is refused before any +// subtree is fetched or written. It then resolves the pin to a concrete ref, +// fetches the skill subtree from the origin via FetchSkill (the ONE fetch impl, +// AP-04), and computes the tree-SHA over the fetched checkout with the same +// TreeSHA that verify uses. The fetched temp dir is the caller's to remove, so the +// returned cleanup removes it. +// +// version records the human-readable label for the lock: the resolved tag when a +// pin was given (so `--pin v1.4.0` is honestly recorded as the tag it resolved +// to), otherwise empty so Add falls back to the manifest version at the fetched +// ref (data-model §3). +func acquireRemote(opts AddOptions) (acquired, error) { + if err := gateRemoteConvention(opts); err != nil { + return acquired{}, err + } + + skillPath := opts.SkillPath + if skillPath == "" { + skillPath = "skills/" + opts.Skill + } + + ref, version, pinned := resolvePin(opts.Skill, opts.Pin, opts.Ref) + + fetch, err := FetchSkill(context.Background(), FetchRequest{ + Owner: opts.Owner, + Repo: opts.Repo, + RepoURL: opts.cloneURL(), + Local: opts.Local, + Skill: opts.Skill, + SkillPath: skillPath, + Ref: ref, + Pinned: pinned, + }) + if err != nil { + return acquired{}, err + } + + cleanup := func() { _ = os.RemoveAll(fetch.Dir) } + + treeSha, err := TreeSHA(fetch.Dir, fetch.Commit, skillPath) + if err != nil { + cleanup() + + return acquired{}, err + } + + return acquired{ + srcDir: filepath.Join(fetch.Dir, filepath.FromSlash(skillPath)), + treeSha: treeSha, + commit: fetch.Commit, + version: version, + cleanup: cleanup, + }, nil +} + +// gateRemoteConvention fetches the origin's index.json at 
the origin's @ref (NOT +// the --pin, which addresses a skill tag, not the origin) through the one catalog +// acquisition path (FetchCatalog, AP-04) and enforces the exact-match convention +// gate (CheckConvention, C1) before any subtree is fetched or vendored (FIX-4). +// A convention mismatch surfaces as *IncompatibleConventionError, which the CLI +// maps via mapConventionError; a fetch failure surfaces as the same +// Auth/Unreachable/NotFound classes FetchCatalog already classifies, anchored on +// the origin identity. The catalog is fetched PER add — never cached. +func gateRemoteConvention(opts AddOptions) error { + catalog, err := FetchCatalog(context.Background(), CatalogRequest{ + RepoURL: opts.cloneURL(), + Origin: opts.Origin, + Ref: opts.Ref, + Local: opts.Local, + }) + if err != nil { + return err + } + + return CheckConvention(catalog.SkillrigConvention) +} + +// bareSemver matches a bare semantic version, optionally v-prefixed (e.g. +// "v1.4.0" or "1.4.0"). A pin matching this is expanded via the name-vSEMVER tag +// scheme; anything else (a full tag, a commit SHA) is taken literally (C3). +var bareSemver = regexp.MustCompile(`^v?[0-9]+\.[0-9]+\.[0-9]+$`) + +// resolvePin maps a --pin value to the concrete git ref to fetch, the +// human-readable version/tag to record in the lock, and whether a pin was given +// (so a failed fetch of a pinned ref classifies as NoSuchVersionError, not a +// missing skill — C2). The single deterministic rule (C3): +// +// - empty pin → fetch the origin's branch ref (fallbackRef); record no +// explicit version (Add reads the manifest); not pinned. +// - bare semver (^v?SEMVER^$) → expand via the name-vSEMVER tag scheme to +// -v; that tag is both the ref and the recorded version. +// - any other value → a literal git ref or commit SHA, passed through as both +// the ref and the recorded version. +// +// So `--pin v1.4.0` and `--pin -v1.4.0` resolve to the same tag and thus +// the same commit/treeSha (SC-004). 
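The three cases of the pin rule (C3) can be tried in isolation. The following is a minimal sketch under stated assumptions: `resolve` and `semverPin` are illustrative names rather than the package's identifiers, and the skill name `terraform-plan-review`, the fallback ref `main`, and the short SHA `3f2a9c1` are made-up inputs.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// semverPin matches a bare semantic version with an optional "v" prefix.
// Illustrative name; it mirrors the pattern described in the comment above.
var semverPin = regexp.MustCompile(`^v?[0-9]+\.[0-9]+\.[0-9]+$`)

// resolve is a hypothetical stand-in for the pin rule: it returns the ref to
// fetch, the version label to record, and whether a pin was given.
func resolve(skill, pin, fallbackRef string) (ref, version string, pinned bool) {
	switch {
	case pin == "":
		// No pin: fetch the origin's branch ref and record no explicit version.
		return fallbackRef, "", false
	case semverPin.MatchString(pin):
		// Bare semver: expand to the skill-scoped tag name.
		tag := skill + "-v" + strings.TrimPrefix(pin, "v")
		return tag, tag, true
	default:
		// Anything else is taken literally (a full tag or a commit SHA).
		return pin, pin, true
	}
}

func main() {
	pins := []string{"", "v1.4.0", "1.4.0", "terraform-plan-review-v1.4.0", "3f2a9c1"}
	for _, pin := range pins {
		ref, version, pinned := resolve("terraform-plan-review", pin, "main")
		fmt.Printf("pin=%q ref=%q version=%q pinned=%v\n", pin, ref, version, pinned)
	}
}
```

In this sketch the bare-semver pin and the already-expanded tag produce the same ref, which is the equivalence the comment attributes to SC-004.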
+func resolvePin(skill, pin, fallbackRef string) (ref, version string, pinned bool) { + if pin == "" { + return fallbackRef, "", false + } + + if bareSemver.MatchString(pin) { + tag := skill + "-v" + strings.TrimPrefix(pin, "v") + + return tag, tag, true + } + + return pin, pin, true } // validateSkillName rejects any skill name that is not a single safe path diff --git a/pkg/skillcore/add_test.go b/pkg/skillcore/add_test.go index 56bf86f..9f47870 100644 --- a/pkg/skillcore/add_test.go +++ b/pkg/skillcore/add_test.go @@ -111,23 +111,23 @@ func TestAdd_Idempotent(t *testing.T) { } } -// TestAdd_IdempotentWhenManifestNameDiffersFromDir guards R2-H1: the lock is -// keyed by the manifest name, so the placement guard must look up the recorded -// fingerprint by that name too — not by the directory arg. data-model only says -// the leaf SHOULD equal the name, so a dir "tf-review" with manifest name -// "terraform-plan-review" is legal; before the fix an identical re-add was -// wrongly refused with an *OverwriteError (FR-003 violation). -func TestAdd_IdempotentWhenManifestNameDiffersFromDir(t *testing.T) { +// TestAdd_IdempotentByManifestName guards R2-H1: the lock is keyed by the +// manifest name, so the placement guard must look up the recorded fingerprint by +// that name; an identical re-add is idempotent (ActionUnchanged), not wrongly +// refused with an *OverwriteError (FR-003). Post-manifest-migration the parse +// contract requires name == directory (data-model §1: removes 002's name/dir +// drift), so the fixture directory equals the manifest name. 
+func TestAdd_IdempotentByManifestName(t *testing.T) { t.Parallel() originDir := t.TempDir() runGit(t, originDir, "init", "-q") - const dirName = "tf-review" // != manifest name "terraform-plan-review" + const dirName = "terraform-plan-review" // == manifest name (parse contract) writeFile(t, originDir, filepath.Join("skills", dirName, "SKILL.md"), 0o644, sampleSkillMd) writeFile(t, originDir, filepath.Join("skills", dirName, "skill.toml"), 0o644, sampleManifest) runGit(t, originDir, "add", "-A") - runGit(t, originDir, "commit", "-q", "-m", "seed dir!=name skill") + runGit(t, originDir, "commit", "-q", "-m", "seed skill") consumer := newConsumer(t) @@ -141,7 +141,7 @@ func TestAdd_IdempotentWhenManifestNameDiffersFromDir(t *testing.T) { } if second.Action != ActionUnchanged { - t.Errorf("Action = %q, want %q (name!=dir must not false-refuse)", second.Action, ActionUnchanged) + t.Errorf("Action = %q, want %q (must not false-refuse)", second.Action, ActionUnchanged) } } diff --git a/pkg/skillcore/auth_test.go b/pkg/skillcore/auth_test.go new file mode 100644 index 0000000..447edb7 --- /dev/null +++ b/pkg/skillcore/auth_test.go @@ -0,0 +1,125 @@ +package skillcore + +import ( + "context" + "encoding/base64" + "os/exec" + "slices" + "strings" + "testing" +) + +// TestAuthConfigEnv pins the token-injection transport seam (F2, research D4): +// an empty token yields NO env (an unauthenticated fetch), and a non-empty token +// yields exactly the GIT_CONFIG_* triple that injects an http.extraHeader Basic +// credential via the ENVIRON, never argv. The credential lives only in +// GIT_CONFIG_VALUE_0 as base64("x-access-token:") — out of `ps` reach. 
+func TestAuthConfigEnv(t *testing.T) { + t.Parallel() + + t.Run("empty token yields no env", func(t *testing.T) { + t.Parallel() + + if got := authConfigEnv(""); len(got) != 0 { + t.Errorf("authConfigEnv(\"\") = %v, want no env", got) + } + }) + + t.Run("non-empty token yields the exact GIT_CONFIG triple", func(t *testing.T) { + t.Parallel() + + const token = "TKN" + + wantB64 := base64.StdEncoding.EncodeToString([]byte("x-access-token:" + token)) + want := []string{ + "GIT_CONFIG_COUNT=1", + "GIT_CONFIG_KEY_0=http.extraHeader", + "GIT_CONFIG_VALUE_0=Authorization: Basic " + wantB64, + } + + got := authConfigEnv(token) + if !slices.Equal(got, want) { + t.Errorf("authConfigEnv(%q) = %v, want %v", token, got, want) + } + }) +} + +// captureCommandContext returns a commandContext that records each git +// invocation's full argv into *capture AND retains the produced *exec.Cmd into +// *cmds, then routes into the helper-process stub (exit 0) so the gitClient runs +// without a real git. Retaining the *exec.Cmd lets the test inspect cmd.Env AFTER +// run/runEnv has set it — the F2 seam for asserting the token lands in the +// ENVIRON, not argv. +func captureCommandContext( + capture *[][]string, + cmds *[]*exec.Cmd, +) func(ctx context.Context, name string, args ...string) *exec.Cmd { + stub := stubCommandContext(0, "") + + return func(ctx context.Context, name string, args ...string) *exec.Cmd { + *capture = append(*capture, slices.Clone(args)) + + cmd := stub(ctx, name, args...) + *cmds = append(*cmds, cmd) + + return cmd + } +} + +// TestClone_TokenInjectionViaEnv pins the security-relevant invariant (F2): a +// non-empty token is injected via git's GIT_CONFIG_* ENV (git >=2.31), so the +// base64 credential is in the process environ — visible to git but NOT in argv +// (where `ps` would expose a `-c http.extraHeader=...` flag). The token must +// NEVER appear in the clone URL or anywhere as a plain argv value either. 
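The environment-based injection that these tests pin down can be sketched on its own. This is a hedged illustration, not the package's `authConfigEnv`: `buildAuthEnv` is a hypothetical name, the token `TKN` and the repository URL are the placeholder values used in the tests, and the command is only constructed, never run.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"os"
	"os/exec"
)

// buildAuthEnv is a hypothetical helper: it returns the GIT_CONFIG_* triple
// that carries an http.extraHeader Basic credential, or nil for an empty token.
func buildAuthEnv(token string) []string {
	if token == "" {
		return nil // unauthenticated fetch: add nothing to the environment
	}

	cred := base64.StdEncoding.EncodeToString([]byte("x-access-token:" + token))

	return []string{
		"GIT_CONFIG_COUNT=1",
		"GIT_CONFIG_KEY_0=http.extraHeader",
		"GIT_CONFIG_VALUE_0=Authorization: Basic " + cred,
	}
}

func main() {
	// The command is built but not executed; the point is where the secret lives.
	cmd := exec.Command("git", "ls-remote", "https://github.com/my-org/my-skills")
	cmd.Env = append(os.Environ(), buildAuthEnv("TKN")...)

	// argv carries only the URL and subcommand, never the credential.
	fmt.Println(cmd.Args)
}
```

The design choice being illustrated is that the credential is placed in `cmd.Env` rather than passed as a `-c http.extraHeader=...` argument, so it does not show up in the argument list.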
+func TestClone_TokenInjectionViaEnv(t *testing.T) { + t.Parallel() + + const ( + token = "TKN" + repoURL = "https://github.com/my-org/my-skills" + destDir = "/tmp/skillrig-dest" + ) + + var ( + captured [][]string + cmds []*exec.Cmd + ) + + c := &gitClient{commandContext: captureCommandContext(&captured, &cmds)} + + if err := c.Clone(context.Background(), repoURL, destDir, token); err != nil { + t.Fatalf("Clone: unexpected error from stubbed git: %v", err) + } + + if len(captured) != 1 || len(cmds) != 1 { + t.Fatalf("Clone issued %d/%d git invocations, want exactly 1", len(captured), len(cmds)) + } + + argv := captured[0] + + wantB64 := base64.StdEncoding.EncodeToString([]byte("x-access-token:" + token)) + + // (a) The base64 credential — and the raw token — must appear in NO argv + // value, and the token must not leak into the clone URL. + for i, a := range argv { + if strings.Contains(a, wantB64) { + t.Errorf("base64 credential leaked into argv[%d] = %q (must live only in the GIT_CONFIG env)", i, a) + } + + if strings.Contains(a, token) { + t.Errorf("raw token leaked into argv[%d] = %q (must live only in the GIT_CONFIG env)", i, a) + } + } + + // (b) The token rides in the process environ via the GIT_CONFIG_* triple, + // which run/runEnv set on the captured *exec.Cmd. 
+ env := cmds[0].Env + + wantValue := "GIT_CONFIG_VALUE_0=Authorization: Basic " + wantB64 + + for _, want := range []string{"GIT_CONFIG_COUNT=1", "GIT_CONFIG_KEY_0=http.extraHeader", wantValue} { + if !slices.Contains(env, want) { + t.Errorf("cmd.Env is missing %q; the token must be injected via the GIT_CONFIG env, got:\n%v", want, env) + } + } +} diff --git a/pkg/skillcore/catalog.go b/pkg/skillcore/catalog.go new file mode 100644 index 0000000..8d45d6a --- /dev/null +++ b/pkg/skillcore/catalog.go @@ -0,0 +1,389 @@ +package skillcore + +import ( + "bytes" + "context" + "encoding/json" + "errors" + "fmt" + "os" + "path/filepath" + "slices" + "strings" + + "github.com/pelletier/go-toml/v2" +) + +// supportedConvention is the single origin convention version this binary +// speaks. The gate is EXACT-MATCH (C1): any other value — higher, lower, or an +// absent/zero field — is incompatible. A forward/backward-compat window is a +// deliberate future change, never an accident of a ">"-only check. +const supportedConvention = 1 + +// originConfigName is the origin's contract file, read by GenerateCatalog for +// the convention version and the skills directory. +const originConfigName = ".skillrig-origin.toml" + +// defaultSkillsDir is the origin's skills directory when .skillrig-origin.toml +// does not override it. +const defaultSkillsDir = "skills" + +// skillManifestName is the per-skill manifest file walked under the skills dir. +const skillManifestName = "SKILL.md" + +// Catalog is the origin's committed, machine-readable list of skills +// (index.json): the single-tip view produced by GenerateCatalog and consumed by +// Search. It is presentation-free — JSON in, JSON out, no human formatting. 
+type Catalog struct { + SkillrigConvention int `json:"skillrigConvention"` + Origin string `json:"origin"` + Skills []CatalogEntry `json:"skills"` +} + +// CatalogEntry is one skill's discovery record in the catalog: the searchable +// fields (name, description, topics) plus what add needs (version, namespace, +// path, requires). It carries no per-skill commit/treeSha — the catalog is +// discovery-only (D2). +type CatalogEntry struct { + Name string `json:"name"` + Version string `json:"version"` + Namespace string `json:"namespace"` + Description string `json:"description"` + Topics []string `json:"topics"` + Path string `json:"path"` + Requires []Require `json:"requires"` +} + +// originConfig is the subset of .skillrig-origin.toml GenerateCatalog reads: the +// convention version (carried into the catalog, never hardcoded — C7) and the +// skills directory to walk. +type originConfig struct { + ConventionVersion int `toml:"convention_version"` + Origin string `toml:"origin"` + SkillsDir string `toml:"skills_dir"` +} + +// catalogName is the origin's committed catalog file, fetched at the repo root. +const catalogName = "index.json" + +// CatalogRequest names where to fetch the origin's index.json from per call +// (FIX-4). It is the single catalog-fetch entry point both search and the add +// convention gate dispatch to (AP-04). The CLI classifies the origin form and +// fills it in; skillcore neither resolves the origin nor reads config. +type CatalogRequest struct { + // RepoURL is the git transport target — the LOCAL origin's file:// or + // the remote GitHub HTTPS URL (config.Origin.CloneURL produces it). + RepoURL string + // Origin is the OWNER/REPO[@REF] (remote) or path (local) identity, used only + // for error reporting so a fetch failure names the configured origin. + Origin string + // Ref is the git ref to read index.json at: the origin's @ref branch, else + // HEAD. A bare empty Ref defaults to HEAD. 
+ Ref string + // Local marks RepoURL as a file:// (local) target: no GitHub token is + // resolved, and failures are never the remote auth/private-not-found classes. + Local bool +} + +// FetchCatalog fetches the origin's index.json at the request's ref and parses it +// into a Catalog (FIX-4). It is fetched PER CALL (no cache) through the one git +// transport (FetchFile) for both the remote and file:// local forms, so search +// and the add convention gate share exactly one catalog acquisition path +// (AP-04). The convention version is NOT gated here — callers run CheckConvention +// against the returned Catalog.SkillrigConvention so parse and policy stay +// separable. A git failure is classified into the renderable typed errors +// (Auth/Unreachable/NotFound) the CLI branches on, anchored on the configured +// origin identity. +func FetchCatalog(ctx context.Context, req CatalogRequest) (Catalog, error) { + ref := req.Ref + if ref == "" { + ref = "HEAD" + } + + var token string + if !req.Local { + token, _ = ResolveGitHubToken(defaultGitHubHost) + } + + data, err := FetchFile(ctx, req.RepoURL, catalogName, ref, token) + if err != nil { + return Catalog{}, classifyCatalogError(req, err) + } + + return ParseCatalog(data) +} + +// classifyCatalogError maps a catalog-fetch git failure onto the renderable typed +// errors, anchoring each on the configured origin identity (the catalog has no +// per-skill identity, so Skill is empty and a NotFound reads as " not +// found"). A non-*GitError (e.g. a parse error) is returned unchanged. 
+func classifyCatalogError(req CatalogRequest, err error) error { + var gitErr *GitError + if !errors.As(err, &gitErr) { + return err + } + + classified := ClassifyGitError(gitErr) + + var ( + authErr *AuthError + unreach *UnreachableError + notFound *NotFoundError + ) + + switch { + case errors.As(classified, &authErr): + return &AuthError{Origin: req.Origin, Cause: gitErr} + case errors.As(classified, &unreach): + return &UnreachableError{Origin: req.Origin, Cause: gitErr} + case errors.As(classified, &notFound): + return &NotFoundError{Origin: req.Origin, Cause: gitErr} + default: + return classified + } +} + +// ParseCatalog decodes index.json bytes into a Catalog. It does not gate the +// convention version — callers run CheckConvention against the decoded +// SkillrigConvention so the parse and the policy stay separable. +func ParseCatalog(data []byte) (Catalog, error) { + var c Catalog + if err := json.Unmarshal(data, &c); err != nil { + return Catalog{}, fmt.Errorf("parsing catalog: %w", err) + } + + return c, nil +} + +// CheckConvention enforces the exact-match convention gate (C1): it returns nil +// only when n is exactly supportedConvention, otherwise an +// *IncompatibleConventionError. A higher version, a lower version, and an +// absent/zero field all fail — there is no compatibility window this slice. +func CheckConvention(n int) error { + if n == supportedConvention { + return nil + } + + return &IncompatibleConventionError{Found: n, Supported: supportedConvention} +} + +// GenerateCatalog walks the origin checkout at originRoot and produces its +// index.json bytes. It reads the convention version and skills directory from +// the origin's .skillrig-origin.toml (the convention is carried, not +// hardcoded — C7), parses each /*/SKILL.md via ParseManifest (the +// ONE manifest parser, AP-04), projects them into CatalogEntry, sorts by name +// for determinism (SC-009), and marshals with stable key order plus a trailing +// newline. 
The returned bytes are byte-identical across runs over an unchanged +// skill set. +func GenerateCatalog(originRoot string) ([]byte, error) { + cfg, err := readOriginConfig(originRoot) + if err != nil { + return nil, err + } + + skillsDir := cfg.SkillsDir + if skillsDir == "" { + skillsDir = defaultSkillsDir + } + + entries, err := collectEntries(originRoot, skillsDir) + if err != nil { + return nil, err + } + + slices.SortFunc(entries, func(a, b CatalogEntry) int { + return strings.Compare(a.Name, b.Name) + }) + + catalog := Catalog{ + SkillrigConvention: cfg.ConventionVersion, + Origin: cfg.Origin, + Skills: entries, + } + + // Encode with HTML-escaping OFF so version constraints stay readable + // (">=1.6", not "\u003e=1.6"). json.MarshalIndent always HTML-escapes; a + // json.Encoder with SetEscapeHTML(false) is the only way to suppress it. + // json.Encoder already appends a trailing newline. + var buf bytes.Buffer + + enc := json.NewEncoder(&buf) + enc.SetEscapeHTML(false) + enc.SetIndent("", " ") + + if err := enc.Encode(catalog); err != nil { + return nil, fmt.Errorf("encoding catalog: %w", err) + } + + return buf.Bytes(), nil +} + +// readOriginConfig parses the origin's .skillrig-origin.toml at originRoot. +func readOriginConfig(originRoot string) (originConfig, error) { + path := filepath.Join(originRoot, originConfigName) + + //nolint:gosec // G304: path is originRoot + a fixed file name, not attacker-controlled. + data, err := os.ReadFile(path) + if err != nil { + return originConfig{}, fmt.Errorf("reading origin config %q: %w", path, err) + } + + var cfg originConfig + if err := toml.Unmarshal(data, &cfg); err != nil { + return originConfig{}, fmt.Errorf("parsing origin config %q: %w", path, err) + } + + return cfg, nil +} + +// collectEntries walks //*/SKILL.md, parses each via +// ParseManifest, and projects it into a CatalogEntry whose path is the skill +// directory relative to originRoot. 
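The HTML-escaping remark in GenerateCatalog is easy to check with the standard library alone. A minimal sketch follows, assuming a hypothetical `constraint` type and helper names (`encodeEscaped`, `encodeReadable`) that are not part of the package; it only contrasts `json.Marshal` with a `json.Encoder` that has `SetEscapeHTML(false)`.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// constraint is a hypothetical one-field record standing in for a requires entry.
type constraint struct {
	Version string `json:"version"`
}

// encodeEscaped uses json.Marshal, which HTML-escapes '<', '>' and '&'.
func encodeEscaped(v any) (string, error) {
	b, err := json.Marshal(v)
	return string(b), err
}

// encodeReadable uses an Encoder with HTML escaping disabled. Encode appends a
// trailing newline to the output.
func encodeReadable(v any) (string, error) {
	var buf bytes.Buffer

	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)

	if err := enc.Encode(v); err != nil {
		return "", err
	}

	return buf.String(), nil
}

func main() {
	c := constraint{Version: ">=1.6"}

	escaped, _ := encodeEscaped(c)
	readable, _ := encodeReadable(c)

	fmt.Println(escaped)  // {"version":"\u003e=1.6"}
	fmt.Print(readable)   // {"version":">=1.6"} followed by the Encoder's newline
}
```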
+func collectEntries(originRoot, skillsDir string) ([]CatalogEntry, error) { + root := filepath.Join(originRoot, skillsDir) + + dirEntries, err := os.ReadDir(root) + if err != nil { + return nil, fmt.Errorf("reading skills dir %q: %w", root, err) + } + + entries := make([]CatalogEntry, 0, len(dirEntries)) + for _, de := range dirEntries { + if !de.IsDir() { + continue + } + + manifestPath := filepath.Join(root, de.Name(), skillManifestName) + if _, err := os.Stat(manifestPath); err != nil { + continue // a directory without a SKILL.md is not a skill + } + + m, err := ParseManifest(manifestPath) + if err != nil { + return nil, err + } + + entries = append(entries, CatalogEntry{ + Name: m.Name, + Version: m.Version, + Namespace: m.Namespace, + Description: m.Description, + Topics: m.Topics, + Path: filepath.ToSlash(filepath.Join(skillsDir, de.Name())), + Requires: m.Requires, + }) + } + + return entries, nil +} + +// relevance bucket scores for search ordering (D8): a higher bucket sorts first, +// ties broken lexicographically by name. +const ( + bucketExactName = 3 // query (joined) equals the name exactly + bucketNameHit = 2 // a query term is a substring of the name + bucketTopicHit = 1 // a query term is a substring of a topic + bucketDescription = 0 // matched on description only +) + +// Search filters catalog deterministically (D8, N6) and returns the matching +// entries ordered by relevance bucket then name. A query term matches when it is +// a substring of lower(name+" "+description+" "+join(topics)) (token-AND: every +// term must match); a requested topic must be present (case-insensitive exact +// membership, AND across topics). Empty query and no topics lists everything. +// Stdlib only — no fuzzy/semantic/learned ranking. 
+func Search(catalog Catalog, query, topics []string) []CatalogEntry { + lowerQuery := lowerAll(query) + lowerTopics := lowerAll(topics) + + matched := make([]CatalogEntry, 0, len(catalog.Skills)) + for _, entry := range catalog.Skills { + if matchesEntry(entry, lowerQuery, lowerTopics) { + matched = append(matched, entry) + } + } + + slices.SortFunc(matched, func(a, b CatalogEntry) int { + ba := relevance(a, lowerQuery) + bb := relevance(b, lowerQuery) + + if ba != bb { + return bb - ba // higher bucket first + } + + return strings.Compare(a.Name, b.Name) + }) + + return matched +} + +// matchesEntry reports whether entry satisfies every lowercased query term +// (token-AND substring over name+description+topics) and carries every requested +// topic (case-insensitive exact membership). +func matchesEntry(entry CatalogEntry, query, topics []string) bool { + haystack := strings.ToLower( + entry.Name + " " + entry.Description + " " + strings.Join(entry.Topics, " "), + ) + + for _, term := range query { + if !strings.Contains(haystack, term) { + return false + } + } + + for _, want := range topics { + if !hasTopic(entry.Topics, want) { + return false + } + } + + return true +} + +// hasTopic reports whether topics contains want, comparing case-insensitively. +func hasTopic(topics []string, want string) bool { + for _, t := range topics { + if strings.EqualFold(t, want) { + return true + } + } + + return false +} + +// relevance computes entry's ordering bucket for the given lowercased query: an +// exact name match outranks a name substring, which outranks a topic substring, +// which outranks a description-only match. 
+func relevance(entry CatalogEntry, query []string) int { + lowerName := strings.ToLower(entry.Name) + + if len(query) > 0 && strings.Join(query, " ") == lowerName { + return bucketExactName + } + + for _, term := range query { + if strings.Contains(lowerName, term) { + return bucketNameHit + } + } + + for _, term := range query { + for _, topic := range entry.Topics { + if strings.Contains(strings.ToLower(topic), term) { + return bucketTopicHit + } + } + } + + return bucketDescription +} + +// lowerAll returns a lowercased copy of in. +func lowerAll(in []string) []string { + out := make([]string, len(in)) + for i, s := range in { + out[i] = strings.ToLower(s) + } + + return out +} diff --git a/pkg/skillcore/catalog_test.go b/pkg/skillcore/catalog_test.go new file mode 100644 index 0000000..0056476 --- /dev/null +++ b/pkg/skillcore/catalog_test.go @@ -0,0 +1,691 @@ +package skillcore + +import ( + "errors" + "path/filepath" + "reflect" + "strconv" + "testing" +) + +// writeOriginFixture lays down a minimal origin checkout under a fresh tmpDir: a +// .skillrig-origin.toml carrying the convention/origin/skills_dir, plus a +// skills//SKILL.md for each (name, frontmatter) pair. It does NOT git-init +// — GenerateCatalog reads the filesystem, not git, so a working tree is enough +// (the tree-SHA oracle elsewhere uses raw git; the catalog never hashes). +func writeOriginFixture(t *testing.T, originToml string, skills map[string]string) string { + t.Helper() + + dir := t.TempDir() + writeFile(t, dir, ".skillrig-origin.toml", 0o644, originToml) + + for name, skillMd := range skills { + writeFile(t, dir, filepath.Join("skills", name, "SKILL.md"), 0o644, skillMd) + } + + return dir +} + +// goldenOriginToml is the origin contract the golden-fixture test builds against. 
+const goldenOriginToml = `convention_version = 1 +origin = "my-org/my-skills" +skills_dir = "skills" +` + +// goldenAlphaSkillMd / goldenBetaSkillMd are the two skills the golden catalog is +// generated from: alpha carries topics + a single requires entry, beta omits +// requires (so the golden pins the requires: null projection too). +const goldenAlphaSkillMd = `--- +name: alpha +description: Alpha skill. +metadata: + x-skillrig.namespace: my-org + x-skillrig.version: 1.0.0 + x-skillrig.topics: [aws, platform] + x-skillrig.requires: + - tool: terraform + version: ">=1.6" + source: hashicorp/terraform + manager: mise +--- +# alpha +` + +const goldenBetaSkillMd = `--- +name: beta +description: Beta skill for review. +metadata: + x-skillrig.namespace: my-org + x-skillrig.version: 2.1.0 + x-skillrig.topics: [review] +--- +# beta +` + +// goldenIndexJSON is the committed ground-truth catalog (SC-009 / D2): the exact +// bytes GenerateCatalog must emit over goldenAlphaSkillMd + goldenBetaSkillMd. +// Entries are sorted by name; a trailing newline is appended; Require fields +// serialize with LOWERCASE keys (tool/version/source/manager) from the type's +// json tags (FIX-5, data-model §2) — the golden documents the producer's actual +// on-disk artifact, and its lowercase keys keep the bug from re-hiding. 
+const goldenIndexJSON = `{ + "skillrigConvention": 1, + "origin": "my-org/my-skills", + "skills": [ + { + "name": "alpha", + "version": "1.0.0", + "namespace": "my-org", + "description": "Alpha skill.", + "topics": [ + "aws", + "platform" + ], + "path": "skills/alpha", + "requires": [ + { + "tool": "terraform", + "version": ">=1.6", + "source": "hashicorp/terraform", + "manager": "mise" + } + ] + }, + { + "name": "beta", + "version": "2.1.0", + "namespace": "my-org", + "description": "Beta skill for review.", + "topics": [ + "review" + ], + "path": "skills/beta", + "requires": null + } + ] +} +` + +// TestGenerateCatalog_EqualsGoldenFixture is the producer==artifact ground-truth +// oracle (D2 contract test, SC-009): GenerateCatalog over a fixed skill set must +// be BYTE-identical to the committed golden index.json — sorted by name, trailing +// newline, convention carried from .skillrig-origin.toml (not hardcoded — C7). A +// diff here means the catalog format drifted under search/add's feet. +func TestGenerateCatalog_EqualsGoldenFixture(t *testing.T) { + t.Parallel() + + dir := writeOriginFixture(t, goldenOriginToml, map[string]string{ + // Insertion order is intentionally NOT lexicographic so the test also + // proves GenerateCatalog sorts rather than echoing directory order. (Go + // map ranging is randomized anyway; ReadDir returns sorted, but writing + // beta-then-alpha here documents the intent.) + "beta": goldenBetaSkillMd, + "alpha": goldenAlphaSkillMd, + }) + + got, err := GenerateCatalog(dir) + if err != nil { + t.Fatalf("GenerateCatalog: unexpected error: %v", err) + } + + if string(got) != goldenIndexJSON { + t.Errorf("GenerateCatalog output != golden fixture\n--- got ---\n%s\n--- want ---\n%s", + got, goldenIndexJSON) + } +} + +// TestGenerateCatalog_Deterministic asserts the determinism contract (SC-009): +// two runs over an unchanged skill set yield byte-identical output (no map-order +// or walk-order nondeterminism leaks into the artifact). 
+func TestGenerateCatalog_Deterministic(t *testing.T) { + t.Parallel() + + dir := writeOriginFixture(t, goldenOriginToml, map[string]string{ + "alpha": goldenAlphaSkillMd, + "beta": goldenBetaSkillMd, + }) + + first, err := GenerateCatalog(dir) + if err != nil { + t.Fatalf("GenerateCatalog (first): %v", err) + } + + second, err := GenerateCatalog(dir) + if err != nil { + t.Fatalf("GenerateCatalog (second): %v", err) + } + + if string(first) != string(second) { + t.Errorf("GenerateCatalog not deterministic across runs:\nfirst:\n%s\nsecond:\n%s", + first, second) + } +} + +// TestCheckConvention_Boundary pins the exact-match convention gate (C1): only the +// single supported value passes; the immediately-adjacent values (one below, one +// above) and the absent/zero field all fail with *IncompatibleConventionError +// carrying the read value. This is the boundary the brief calls out (0, absent, +// and 2 give the error, 1 is ok) — encoded as 0/1/2 plus the explicit absent +// case (a missing skillrigConvention decodes to 0). 
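The claim that an absent `skillrigConvention` decodes to 0 and therefore fails the gate can be reproduced with a few lines. This is a sketch under assumptions: `header`, `gate`, and `supported` are illustrative names, and the boolean return stands in for the typed error the real gate returns.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// supported is the single convention value this sketch accepts (assumed to be 1,
// matching the boundary table described above).
const supported = 1

// header is a hypothetical projection of index.json that keeps only the
// convention field.
type header struct {
	SkillrigConvention int `json:"skillrigConvention"`
}

// gate decodes raw and reports the convention it found and whether it matches
// exactly. A missing field leaves the int at its zero value.
func gate(raw string) (found int, ok bool, err error) {
	var h header
	if err := json.Unmarshal([]byte(raw), &h); err != nil {
		return 0, false, err
	}

	return h.SkillrigConvention, h.SkillrigConvention == supported, nil
}

func main() {
	inputs := []string{
		`{"origin":"my-org/my-skills"}`, // field absent
		`{"skillrigConvention":1}`,
		`{"skillrigConvention":2}`,
	}

	for _, raw := range inputs {
		found, ok, err := gate(raw)
		fmt.Printf("%s -> found=%d ok=%v err=%v\n", raw, found, ok, err)
	}
}
```

Because the comparison is equality rather than a range check, the absent field, a lower value, and a higher value all take the same failing path.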
+func TestCheckConvention_Boundary(t *testing.T) { + t.Parallel() + + tests := []struct { + name string + n int + wantErr bool + }{ + {name: "zero (absent field) fails", n: 0, wantErr: true}, + {name: "exactly one is ok", n: 1, wantErr: false}, + {name: "two (one above) fails", n: 2, wantErr: true}, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Parallel() + + err := CheckConvention(tt.n) + + if !tt.wantErr { + if err != nil { + t.Fatalf("CheckConvention(%d) = %v, want nil", tt.n, err) + } + + return + } + + var convErr *IncompatibleConventionError + if !errors.As(err, &convErr) { + t.Fatalf("CheckConvention(%d) error = %T (%v), want *IncompatibleConventionError", + tt.n, err, err) + } + + if convErr.Found != tt.n { + t.Errorf("IncompatibleConventionError.Found = %d, want %d", convErr.Found, tt.n) + } + + if convErr.Supported != supportedConvention { + t.Errorf("IncompatibleConventionError.Supported = %d, want %d", + convErr.Supported, supportedConvention) + } + }) + } +} + +// TestSearch_OrderingAndDeterminism is the table-driven matcher contract (D8): +// query → the EXACT ordered list of matching names. It pins the relevance +// buckets (exact-name > name-substring > topic-hit > description-only) with the +// lexicographic-by-name tiebreak, the token-AND substring rule, the case +// insensitivity, the --topic membership filter, and the empty-query "list all" +// and no-match "empty" edges. Exact order is asserted — not set membership — so a +// ranking regression is caught. +func TestSearch_OrderingAndDeterminism(t *testing.T) { + t.Parallel() + + // A fixed catalog exercising every bucket. Names chosen so the lexicographic + // tiebreak is observable within a bucket (e.g. two name-substring hits). 
+ catalog := Catalog{ + SkillrigConvention: 1, + Origin: "my-org/my-skills", + Skills: []CatalogEntry{ + { + Name: "terraform", + Description: "infra as code", + Topics: []string{"infra"}, + }, + { + Name: "terraform-plan-review", + Description: "Review a terraform plan", + Topics: []string{"platform", "aws"}, + }, + { + Name: "drift-detector", + Description: "detect terraform drift across stacks", + Topics: []string{"terraform-tooling"}, + }, + { + Name: "cost-estimator", + Description: "estimate spend before apply", + Topics: []string{"finops", "aws"}, + }, + { + Name: "terraform-fmt", + Description: "format HCL", + Topics: []string{"style"}, + }, + }, + } + + tests := []struct { + name string + query []string + topics []string + want []string + }{ + { + name: "empty query lists all by name", + query: nil, + want: []string{ + "cost-estimator", + "drift-detector", + "terraform", + "terraform-fmt", + "terraform-plan-review", + }, + }, + { + name: "relevance buckets then lexicographic name", + query: []string{"terraform"}, + // exact-name(3): terraform + // name-substring(2): terraform-fmt, terraform-plan-review (lex) + // topic-hit(1): drift-detector (topic "terraform-tooling") + // description-only(0): cost-estimator does NOT match (no "terraform") + want: []string{ + "terraform", + "terraform-fmt", + "terraform-plan-review", + "drift-detector", + }, + }, + { + name: "case-insensitive substring match", + query: []string{"TERRAFORM"}, + want: []string{ + "terraform", + "terraform-fmt", + "terraform-plan-review", + "drift-detector", + }, + }, + { + name: "token-AND requires every term to match", + query: []string{"terraform", "review"}, + // only terraform-plan-review has both "terraform" and "review". 
+ want: []string{"terraform-plan-review"}, + }, + { + name: "topic filter (case-insensitive membership)", + query: nil, + topics: []string{"AWS"}, + want: []string{"cost-estimator", "terraform-plan-review"}, + }, + { + name: "query and topic combine (AND)", + query: []string{"terraform"}, + topics: []string{"aws"}, + want: []string{"terraform-plan-review"}, + }, + { + name: "no match yields empty", + query: []string{"kubernetes"}, + want: []string{}, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Parallel() + + got := namesOf(Search(catalog, tt.query, tt.topics)) + + if !reflect.DeepEqual(got, tt.want) { + t.Errorf("Search(%v, topics=%v) order =\n %v\nwant\n %v", + tt.query, tt.topics, got, tt.want) + } + + // Determinism: a second call over the same inputs is identical. + again := namesOf(Search(catalog, tt.query, tt.topics)) + if !reflect.DeepEqual(again, got) { + t.Errorf("Search not deterministic: first %v, second %v", got, again) + } + }) + } +} + +// namesOf projects entries to their names, preserving order. It returns a +// non-nil empty slice for an empty input so reflect.DeepEqual against a +// []string{} literal (the no-match expectation) holds. +func namesOf(entries []CatalogEntry) []string { + out := make([]string, 0, len(entries)) + for _, e := range entries { + out = append(out, e.Name) + } + + return out +} + +// TestClassifyGitError_StderrToTyped is the stderr→typed-error classification +// table (D4/D6): each representative git/gh stderr (all exit 128) must map to its +// network-class typed error, an unrecognized stderr must pass the raw *GitError +// through unchanged, and a nil input must stay nil. The classifier is the single +// place this mapping lives (skillcore, never cli); the typed errors wrap the +// original so --verbose still reaches the raw git output. 
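The classifier's implementation is not part of this hunk, so the following is only a guess at its general shape: a case-insensitive substring match over stderr. `classifyStderr`, `containsAny`, the string class labels, and the exact patterns are all assumptions derived from the stderr samples in the test table below, not the package's actual logic.

```go
package main

import (
	"fmt"
	"strings"
)

// containsAny reports whether s contains at least one of subs.
func containsAny(s string, subs ...string) bool {
	for _, sub := range subs {
		if strings.Contains(s, sub) {
			return true
		}
	}

	return false
}

// classifyStderr is a hypothetical classifier: it lowercases stderr and maps
// known phrases to a class label, falling back to "raw" when nothing matches.
func classifyStderr(stderr string) string {
	s := strings.ToLower(stderr)

	switch {
	case containsAny(s, "authentication failed", "invalid username or token"):
		return "auth"
	case containsAny(s, "could not resolve host", "failed to connect"):
		return "unreachable"
	case strings.Contains(s, "repository") && strings.Contains(s, "not found"):
		return "not-found"
	default:
		return "raw"
	}
}

func main() {
	samples := []string{
		"remote: Authentication failed for 'https://github.com/my-org/my-skills/'",
		"fatal: unable to access '...': Failed to connect to github.com port 443",
		"fatal: repository 'https://github.com/my-org/missing/' not found",
		"fatal: early EOF",
	}

	for _, stderr := range samples {
		fmt.Printf("%-12s %s\n", classifyStderr(stderr), stderr)
	}
}
```

Lowercasing first is what lets one pattern cover both the "repository ... not found" and the capitalized "Repository not found." forms in the table.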
+func TestClassifyGitError_StderrToTyped(t *testing.T) { + t.Parallel() + + t.Run("nil passes through as nil", func(t *testing.T) { + t.Parallel() + + if err := ClassifyGitError(nil); err != nil { + t.Fatalf("ClassifyGitError(nil) = %v, want nil", err) + } + }) + + tests := []struct { + name string + stderr string + // want is the typed class the stderr must classify into. classRaw means + // "left as the bare *GitError" (no network class matched). + want errClass + }{ + { + name: "authentication failed -> AuthError", + stderr: "remote: Authentication failed for 'https://github.com/my-org/my-skills/'", + want: classAuth, + }, + { + name: "invalid username or token -> AuthError", + stderr: "fatal: Invalid username or token. Password authentication is not supported.", + want: classAuth, + }, + { + name: "could not resolve host -> UnreachableError", + stderr: "fatal: unable to access 'https://github.com/...': Could not resolve host: github.com", + want: classUnreachable, + }, + { + name: "failed to connect -> UnreachableError", + stderr: "fatal: unable to access '...': Failed to connect to github.com port 443", + want: classUnreachable, + }, + { + name: "repository not found (URL form) -> NotFoundError", + stderr: "fatal: repository 'https://github.com/my-org/missing/' not found", + want: classNotFound, + }, + { + name: "remote: Repository not found (capital R) -> NotFoundError", + stderr: "remote: Repository not found.\nfatal: repository '...' 
not found", + want: classNotFound, + }, + { + name: "unrecognized stderr passes the raw *GitError through", + stderr: "fatal: early EOF", + want: classRaw, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Parallel() + + raw := &GitError{ExitCode: 128, Stderr: tt.stderr} + + out := ClassifyGitError(raw) + + if got := classOf(out); got != tt.want { + t.Errorf("ClassifyGitError(%q) class = %v, want %v", tt.stderr, got, tt.want) + } + + // Every classified error must still unwrap to the raw *GitError so + // --verbose can reach the original stderr (the raw case IS the + // *GitError; the network cases wrap it). + var ge *GitError + if !errors.As(out, &ge) { + t.Fatalf("classified error %v does not unwrap to *GitError", out) + } + + if ge.Stderr != tt.stderr { + t.Errorf("unwrapped stderr = %q, want %q", ge.Stderr, tt.stderr) + } + }) + } +} + +// errClass names the typed failure ClassifyGitError can produce, so the table can +// assert the matched class declaratively (one classOf call) instead of a per-case +// errors.As closure — keeping the test's branching flat. +type errClass int + +const ( + classRaw errClass = iota // left as the bare *GitError (no network class) + classAuth + classUnreachable + classNotFound +) + +func (c errClass) String() string { + switch c { + case classAuth: + return "AuthError" + case classUnreachable: + return "UnreachableError" + case classNotFound: + return "NotFoundError" + case classRaw: + return "raw *GitError" + default: + return "unknown" + } +} + +// classOf reports which network class err was classified into; classRaw means no +// network class matched (it is still the bare *GitError). The most specific match +// wins — the network types are checked before falling through to raw. 
+func classOf(err error) errClass { + var ( + a *AuthError + u *UnreachableError + n *NotFoundError + ) + + switch { + case errors.As(err, &a): + return classAuth + case errors.As(err, &u): + return classUnreachable + case errors.As(err, &n): + return classNotFound + default: + return classRaw + } +} + +// ghStub is the behavior a fake `gh` binary impersonates for one case: print +// stdout, then exit with exitCode. absent==true installs NO gh on PATH (the +// fallback tier is skipped entirely). +type ghStub struct { + absent bool + stdout string + exitCode int +} + +// installGhStub points PATH at a fresh dir for this case. When stub.absent it +// leaves the dir empty (exec.LookPath("gh") fails → fallback skipped); otherwise +// it writes an executable `gh` shell script that emits stub.stdout and exits with +// stub.exitCode. Returning a fresh dir per case (no inherited PATH) keeps the +// fallback hermetic — a real gh on the developer's PATH can never leak in. +func installGhStub(t *testing.T, stub ghStub) { + t.Helper() + + binDir := t.TempDir() + + if !stub.absent { + // A leading newline in stdout proves the resolver TrimSpaces the output. + script := "#!/bin/sh\nprintf '%s' \"" + stub.stdout + "\"\nexit " + + strconv.Itoa(stub.exitCode) + "\n" + writeFile(t, binDir, "gh", 0o755, script) + } + + // Replace PATH entirely (not prepend) so no ambient gh is reachable. + t.Setenv("PATH", binDir) +} + +// TestResolveGitHubToken_Precedence pins the resolver order (D4): GH_TOKEN env > +// GITHUB_TOKEN env > `gh auth token`. The env tiers are exercised directly +// (whitespace-only is "unset"); the gh-fallback tier is exercised by impersonating +// the gh binary on a scrubbed PATH (a stub shell script), so the test never +// touches a real gh session. Each case clears both env vars first so the +// precedence is unambiguous. 
+// +// This test mutates process env (GH_TOKEN/GITHUB_TOKEN/PATH), so it does NOT run +// in parallel — it would race other tests reading those globals. +// +//nolint:tparallel // mutates os.Environ (GH_TOKEN/GITHUB_TOKEN/PATH); must run serially. +func TestResolveGitHubToken_Precedence(t *testing.T) { + tests := []struct { + name string + // env tiers ("" + ok=false means the var is left unset) + ghToken string + ghTokenOK bool + ghubToken string + ghubOK bool + // gh fallback impersonation + gh ghStub + + wantToken string + wantOK bool + }{ + { + name: "GH_TOKEN wins over GITHUB_TOKEN and gh", + ghToken: "gh-env-token", + ghTokenOK: true, + ghubToken: "github-env-token", + ghubOK: true, + gh: ghStub{stdout: "gh-cli-token"}, + wantToken: "gh-env-token", + wantOK: true, + }, + { + name: "GITHUB_TOKEN used when GH_TOKEN unset", + ghubToken: "github-env-token", + ghubOK: true, + gh: ghStub{stdout: "gh-cli-token"}, + wantToken: "github-env-token", + wantOK: true, + }, + { + name: "whitespace-only GH_TOKEN is treated as unset", + ghToken: " ", + ghTokenOK: true, + ghubToken: "github-env-token", + ghubOK: true, + wantToken: "github-env-token", + wantOK: true, + }, + { + name: "falls back to gh auth token when both env unset", + gh: ghStub{stdout: "\ngh-cli-token\n"}, // leading/trailing ws trimmed + wantToken: "gh-cli-token", + wantOK: true, + }, + { + name: "gh absent from PATH -> (\"\", false)", + gh: ghStub{absent: true}, + wantToken: "", + wantOK: false, + }, + { + name: "gh present but non-zero exit -> (\"\", false)", + gh: ghStub{stdout: "stale-token", exitCode: 1}, + wantToken: "", + wantOK: false, + }, + { + name: "gh exit 0 but empty/whitespace stdout -> (\"\", false)", + gh: ghStub{stdout: " \n"}, + wantToken: "", + wantOK: false, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + // t.Setenv auto-restores. 
Setting an unwanted tier to "" is + // equivalent to unsetting it: the resolver treats an empty (or + // whitespace-only) value as "unset", so the precedence under test is + // unambiguous without touching os.Unsetenv. + ghEnv := "" + if tt.ghTokenOK { + ghEnv = tt.ghToken + } + + t.Setenv("GH_TOKEN", ghEnv) + + ghubEnv := "" + if tt.ghubOK { + ghubEnv = tt.ghubToken + } + + t.Setenv("GITHUB_TOKEN", ghubEnv) + + installGhStub(t, tt.gh) + + gotToken, gotOK := ResolveGitHubToken("github.com") + + if gotToken != tt.wantToken || gotOK != tt.wantOK { + t.Errorf("ResolveGitHubToken() = (%q, %v), want (%q, %v)", + gotToken, gotOK, tt.wantToken, tt.wantOK) + } + }) + } +} + +// TestManifest_MetadataXSkillrigKeys focuses the parse contract on the +// metadata.x-skillrig.* extension lift (D1): every namespaced key — namespace, +// version, convention-version, topics, and the nested requires list — is hoisted +// onto the flat Manifest, while standard frontmatter (name, description) is read +// verbatim and unknown keys (top-level and inside metadata) are ignored. This is +// the catalog/add data source, so the lift is asserted field-by-field. +func TestManifest_MetadataXSkillrigKeys(t *testing.T) { + t.Parallel() + + const skillMd = `--- +name: alpha +description: An alpha skill. 
+license: MIT +allowed-tools: Bash(git:*) Read +metadata: + x-skillrig.namespace: my-org + x-skillrig.version: 1.4.0 + x-skillrig.convention-version: "1" + x-skillrig.topics: [platform-team, terraform, aws] + x-skillrig.unknown-extension: ignored + x-skillrig.requires: + - tool: oxid + version: ">=0.4.0" + source: my-org/my-skills + manager: mise +--- + +# alpha +` + + path := writeSkillMd(t, "alpha", skillMd) + + got, err := ParseManifest(path) + if err != nil { + t.Fatalf("ParseManifest: %v", err) + } + + want := Manifest{ + Name: "alpha", + Description: "An alpha skill.", + Namespace: "my-org", + Version: "1.4.0", + Convention: "1", + Topics: []string{"platform-team", "terraform", "aws"}, + Requires: []Require{ + { + Tool: "oxid", + Version: ">=0.4.0", + Source: "my-org/my-skills", + Manager: "mise", + }, + }, + } + + if !reflect.DeepEqual(got, want) { + t.Errorf("ParseManifest metadata.x-skillrig lift mismatch:\n got = %+v\nwant = %+v", got, want) + } +} diff --git a/pkg/skillcore/errors.go b/pkg/skillcore/errors.go index 1688b7c..843f803 100644 --- a/pkg/skillcore/errors.go +++ b/pkg/skillcore/errors.go @@ -3,6 +3,7 @@ package skillcore import ( "fmt" "strconv" + "strings" ) // VerifyFailure is returned by Verify when at least one verdict is not ok. It @@ -91,3 +92,166 @@ type GitError struct { func (e *GitError) Error() string { return fmt.Sprintf("git failed (exit %d): %s", e.ExitCode, e.Stderr) } + +// AuthError is returned when a remote fetch fails authentication against the +// origin — git reported "Authentication failed" or "Invalid username or token". +// It is distinct from *NotFoundError (which GitHub returns for a private repo +// when NO token was resolved): an AuthError means a credential WAS presented but +// rejected. Origin is the OWNER/REPO[@REF] reference being reached. +// Presentation-free: the CLI renders the what/why/fix prose. Cause carries the +// raw *GitError, surfaced under --verbose. 
+type AuthError struct { + Origin string + Cause error +} + +func (e *AuthError) Error() string { + return fmt.Sprintf("authentication failed reaching %q", e.Origin) +} + +func (e *AuthError) Unwrap() error { + return e.Cause +} + +// UnreachableError is returned when a remote fetch cannot reach the origin host +// — git reported "Could not resolve host" or "Failed to connect". It signals a +// network/connectivity problem (or a misspelled origin), as opposed to an +// auth/permission or missing-repo failure. Origin is the OWNER/REPO[@REF] +// reference. Presentation-free; Cause carries the raw *GitError for --verbose. +type UnreachableError struct { + Origin string + Cause error +} + +func (e *UnreachableError) Error() string { + return fmt.Sprintf("could not reach %q", e.Origin) +} + +func (e *UnreachableError) Unwrap() error { + return e.Cause +} + +// NotFoundError is returned when the origin repository (or a requested skill +// subtree within it) does not exist — git reported "repository not found". It is +// deliberately distinct from *SkillNotFoundError (the local-path origin IS +// checked out but lacks the skill) and from *NoSuchVersionError (the skill +// exists but the pinned version does not). The GitHub subtlety: a private repo +// reached without a resolved token also reports "not found", so Authenticated +// records whether a token was presented — the CLI adds an "if private, +// authenticate" hint when it was not. Presentation-free; Cause carries the raw +// *GitError for --verbose. 
+type NotFoundError struct { + Origin string + Skill string + Authenticated bool + Cause error +} + +func (e *NotFoundError) Error() string { + if e.Skill != "" { + return fmt.Sprintf("%q not found in %q", e.Skill, e.Origin) + } + + return fmt.Sprintf("%q not found", e.Origin) +} + +func (e *NotFoundError) Unwrap() error { + return e.Cause +} + +// NoSuchVersionError is returned when a --pin reference resolves to no existing +// tag or commit in the origin — the skill exists, but the requested version does +// not. It is a distinct type (not *NotFoundError) so callers/CI can branch on a +// bad pin versus a missing skill rather than on prose (C2). Ref is the +// unresolved pin as the user supplied it. Presentation-free; Cause carries the +// raw *GitError for --verbose. +type NoSuchVersionError struct { + Skill string + Ref string + Cause error +} + +func (e *NoSuchVersionError) Error() string { + return fmt.Sprintf("%q has no version %q", e.Skill, e.Ref) +} + +func (e *NoSuchVersionError) Unwrap() error { + return e.Cause +} + +// IncompatibleConventionError is returned when the origin's catalog declares a +// skillrigConvention this binary does not support. The gate is exact-match (C1): +// only convention 1 is accepted, so any other value — a higher version, a lower +// version, or an absent/zero field — is incompatible. Found is the value read +// from the origin; Supported is the single version this tool implements. +// Presentation-free: the CLI renders the upgrade/mismatch guidance. +type IncompatibleConventionError struct { + Found int + Supported int +} + +func (e *IncompatibleConventionError) Error() string { + return fmt.Sprintf( + "origin uses convention v%d (this tool supports exactly v%d)", + e.Found, + e.Supported, + ) +} + +// ClassifyGitError maps a raw *GitError from a remote fetch onto a typed, +// renderable failure by matching its captured stderr. 
The three network classes +// are the only ones classified here; an unrecognized stderr returns the original +// *GitError unchanged (the CLI surfaces it raw under --verbose). Classification +// lives in skillcore (the fetch layer), never in internal/cli — the prose +// what/why/fix is the CLI's job, the failure CLASS is skillcore's. The returned +// typed errors wrap err so --verbose can still reach the raw git output. +func ClassifyGitError(err *GitError) error { + if err == nil { + return nil + } + + stderr := err.Stderr + + switch { + case containsAny(stderr, "Authentication failed", "Invalid username or token"): + return &AuthError{Cause: err} + case containsAny( + stderr, + "Could not resolve host", + "Failed to connect", + // Local (file://-or-path) origins that are missing or not a repo fail + // with these anchors instead of a host/connect error; neither contains + // "not found", so isRepoNotFound is unaffected and auth is matched above. + "Could not read from remote repository", + "does not appear to be a git repository", + ): + return &UnreachableError{Cause: err} + case isRepoNotFound(stderr): + return &NotFoundError{Cause: err} + default: + return err + } +} + +// isRepoNotFound reports whether git stderr signals a missing repository. git +// emits the URL between "repository" and "not found" (e.g. +// `repository 'https://…/' not found`) and a separate "remote: Repository not +// found." line with a capital R, so a case-sensitive "repository not found" +// substring matches neither. Match the lowercased text on both anchors instead. +func isRepoNotFound(stderr string) bool { + lower := strings.ToLower(stderr) + + return strings.Contains(lower, "repository") && + strings.Contains(lower, "not found") +} + +// containsAny reports whether s contains any of the given substrings. 
+func containsAny(s string, subs ...string) bool { + for _, sub := range subs { + if strings.Contains(s, sub) { + return true + } + } + + return false +} diff --git a/pkg/skillcore/fetch.go b/pkg/skillcore/fetch.go new file mode 100644 index 0000000..5fd2a6a --- /dev/null +++ b/pkg/skillcore/fetch.go @@ -0,0 +1,202 @@ +package skillcore + +import ( + "context" + "errors" +) + +// defaultGitHubHost is the host skillrig fetches origins from. It is the seam +// for GitHub Enterprise (research D4): ResolveGitHubToken and the clone URL both +// thread a hostname today fixed to github.com. +const defaultGitHubHost = "github.com" + +// FetchRequest names a single skill subtree to fetch from a remote origin. The +// caller (the CLI add path) supplies the already-classified remote coordinates; +// skillcore neither resolves the origin nor reads config — it owns only the git +// transport and the failure classification (AP-06). +type FetchRequest struct { + // Owner and Repo are the OWNER/REPO halves of a REMOTE origin reference. They + // are used for error reporting (originRef) and, when RepoURL is empty, to + // derive the GitHub HTTPS clone URL. They are empty for a LOCAL origin. + Owner string + Repo string + // RepoURL is the git transport target to clone from when set — the LOCAL + // origin's file:// (FIX-1, the file:// test substrate AND FR-011) or any + // caller-supplied URL. Empty means "derive https://github.com/Owner/Repo.git". + // The seam that lets a local-path/file:// origin be fetched without hardcoding + // github.com (config.Origin.CloneURL produces this value). + RepoURL string + // Local marks RepoURL as a file:// (local) target. A local origin needs no + // GitHub token and its failures are never the remote auth/private-not-found + // classes, so the token is not resolved for it. + Local bool + // Skill is the skill directory name, used in error reporting so the CLI can + // distinguish a missing skill from a missing origin (errors-as-navigation). 
+ Skill string + // SkillPath is the repo-relative path of the skill subtree to sparse-check-out + // (the catalog's path, or the conventional skills/). + SkillPath string + // Ref is the git ref to check out: the resolved --pin tag/SHA when pinning, + // else the origin's @ref branch (D7). + Ref string + // Pinned marks Ref as a user-supplied --pin. A failed checkout of a pin means + // the requested version does not exist (NoSuchVersionError, C2), distinct from + // a branch ref that points at a real, fetchable tip. + Pinned bool +} + +// FetchResult is the outcome of a successful FetchSkill: the local temp dir the +// subtree was checked out into (the caller's to remove) and the resolved upstream +// commit (provenance for the lock entry, D5). +type FetchResult struct { + Dir string + Commit string +} + +// originRef renders the origin identity for error reporting — the +// human-meaningful anchor the CLI's what/why/fix prose uses. For a remote origin +// it is OWNER/REPO[@REF]; for a LOCAL origin (no Owner/Repo) it is the RepoURL, +// since that is the only identity the user configured. +func (r FetchRequest) originRef() string { + if r.Owner == "" && r.Repo == "" { + return r.RepoURL + } + + ref := r.Owner + "/" + r.Repo + if r.Ref != "" { + ref += "@" + r.Ref + } + + return ref +} + +// FetchSkill fetches one skill subtree from the remote OWNER/REPO origin at +// req.Ref and returns the local temp path plus the resolved upstream commit. It +// resolves a GitHub token via ResolveGitHubToken (skipped silently when none is +// available — a public origin needs none), sparse-checks-out req.SkillPath via +// the git.go FetchSparse helper (one git transport, research D7), then resolves +// HEAD to the exact commit for the lock's provenance field. 
+// +// All failures are classified, never surfaced raw: +// - a network/auth/missing-repo git failure → ClassifyGitError's AuthError / +// UnreachableError / NotFoundError; +// - a NotFoundError reached with NO resolved token is marked unauthenticated so +// the CLI can add the "if private, authenticate" hint (research D4); +// - a failed checkout of a --pin ref → NoSuchVersionError (C2), distinct from a +// missing skill or origin. +// +// On any failure the temp dir FetchSparse created is already cleaned up by that +// helper, so FetchSkill returns no path to remove. +func FetchSkill(ctx context.Context, req FetchRequest) (FetchResult, error) { + repoURL := req.cloneURL() + + // A local (file://) origin needs no credential and never produces the remote + // auth/private-not-found classes, so the token is resolved only for GitHub. + var ( + token string + authenticated bool + ) + + if !req.Local { + token, authenticated = ResolveGitHubToken(defaultGitHubHost) + } + + dir, err := FetchSparse(ctx, repoURL, req.SkillPath, req.Ref, token) + if err != nil { + return FetchResult{}, classifyFetchError(req, authenticated, err) + } + + commit, err := revParse(dir, "HEAD") + if err != nil { + return FetchResult{}, classifyFetchError(req, authenticated, err) + } + + return FetchResult{Dir: dir, Commit: commit}, nil +} + +// cloneURL derives the git transport target for the request: an explicit +// RepoURL (the LOCAL origin's file://, or any caller-supplied URL) when +// set, else the GitHub HTTPS URL for OWNER/REPO (FIX-1 — the seam that stops +// hardcoding github.com so a file:// origin is fetchable). The token is never +// embedded here — git.go injects it via the GIT_CONFIG http.extraHeader env, kept +// out of argv (research D4) — so the URL is safe to surface in diagnostics. 
+func (r FetchRequest) cloneURL() string { + if r.RepoURL != "" { + return r.RepoURL + } + + return "https://" + defaultGitHubHost + "/" + r.Owner + "/" + r.Repo + ".git" +} + +// classifyFetchError turns a raw fetch failure into the renderable typed error +// the CLI branches on. It runs the shared ClassifyGitError mapping +// (Auth/Unreachable/NotFound), then applies the fetch-specific refinements +// ClassifyGitError cannot know on its own, all anchored on WHICH git phase failed +// (FIX-3 — the *fetchStepError tag): +// +// - A failed CHECKOUT of a --pin ref is a missing VERSION (NoSuchVersionError, +// C2): the repo and skill subtree were cloned fine, only the requested +// tag/SHA does not exist. This is the ONLY path that yields NoSuchVersion — a +// missing/private/unreachable REPO (a clone-phase failure) never becomes +// "no such version" even with --pin (FIX-3 fixes that mis-classification). +// - A genuine clone-phase NotFound gets the origin/skill identity plus the +// no-token authentication flag (D4). +// - Auth/Unreachable get the origin identity populated (FIX-7), which +// ClassifyGitError leaves blank. +// +// A non-*GitError (e.g. an os error from temp-dir creation) is returned +// unchanged. +func classifyFetchError(req FetchRequest, authenticated bool, err error) error { + var gitErr *GitError + if !errors.As(err, &gitErr) { + return err + } + + // The phase that failed: a checkout-step failure is about ref/version + // existence; everything else (clone, sparse-cone, object fetch) is about the + // repo. Absent a step tag (e.g. a rev-parse after the fetch), treat it as a + // clone-class repo failure — never a missing version. + var stepErr *fetchStepError + + checkoutFailed := errors.As(err, &stepErr) && stepErr.step == stepCheckout + + classified := ClassifyGitError(gitErr) + + // A pin whose CHECKOUT failed is a missing version, not a missing skill: the + // skill exists, the requested tag/SHA does not (C2). 
Gated on the checkout + // phase so a missing/private repo with --pin stays a NotFound (FIX-3). + if req.Pinned && checkoutFailed { + return &NoSuchVersionError{ + Skill: req.Skill, + Ref: req.Ref, + Cause: gitErr, + } + } + + // FIX-7: ClassifyGitError builds Auth/Unreachable with a blank Origin; rebuild + // each with the configured origin identity. NotFound additionally carries the + // skill name and the no-token flag for the "if private, authenticate" hint + // (D4): GitHub reports "not found" (not 403) for a private repo reached + // without a token. + var ( + authErr *AuthError + unreach *UnreachableError + notFound *NotFoundError + ) + + switch { + case errors.As(classified, &authErr): + return &AuthError{Origin: req.originRef(), Cause: gitErr} + case errors.As(classified, &unreach): + return &UnreachableError{Origin: req.originRef(), Cause: gitErr} + case errors.As(classified, &notFound): + return &NotFoundError{ + Origin: req.originRef(), + Skill: req.Skill, + Authenticated: authenticated, + Cause: gitErr, + } + default: + return classified + } +} diff --git a/pkg/skillcore/fetch_test.go b/pkg/skillcore/fetch_test.go new file mode 100644 index 0000000..4e22e05 --- /dev/null +++ b/pkg/skillcore/fetch_test.go @@ -0,0 +1,404 @@ +package skillcore + +import ( + "errors" + "testing" +) + +// TestResolvePin pins the deterministic --pin resolution rule (C3, data-model +// §3): a bare semver (optionally v-prefixed) expands via the name-vSEMVER tag +// scheme; a full tag or commit SHA is taken literally; an empty pin falls back +// to the origin branch ref and records no explicit version. resolvePin is the one +// implementation add dispatches to, so the rule is asserted directly on it rather +// than only end-to-end. +func TestResolvePin(t *testing.T) { + t.Parallel() + + const ( + skill = "terraform-plan-review" + fallback = "main" + // sha is a 40-hex commit id — neither a bare semver nor a name-vX tag. 
+ sha = "0123456789abcdef0123456789abcdef01234567" + ) + + tests := []struct { + name string + pin string + wantRef string + wantVersion string + wantPinned bool + }{ + { + name: "empty pin falls back to the origin branch ref, unpinned", + pin: "", + wantRef: fallback, + wantVersion: "", + wantPinned: false, + }, + { + name: "bare v-prefixed semver expands to the name-vSEMVER tag", + pin: "v1.4.0", + wantRef: skill + "-v1.4.0", + wantVersion: skill + "-v1.4.0", + wantPinned: true, + }, + { + name: "bare semver without v expands to the same name-vSEMVER tag", + pin: "1.4.0", + wantRef: skill + "-v1.4.0", + wantVersion: skill + "-v1.4.0", + wantPinned: true, + }, + { + name: "full tag is taken literally (no re-expansion)", + pin: skill + "-v1.4.0", + wantRef: skill + "-v1.4.0", + wantVersion: skill + "-v1.4.0", + wantPinned: true, + }, + { + name: "commit SHA is a literal ref", + pin: sha, + wantRef: sha, + wantVersion: sha, + wantPinned: true, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Parallel() + + ref, version, pinned := resolvePin(skill, tt.pin, fallback) + + if ref != tt.wantRef { + t.Errorf("resolvePin(%q) ref = %q, want %q", tt.pin, ref, tt.wantRef) + } + + if version != tt.wantVersion { + t.Errorf("resolvePin(%q) version = %q, want %q", tt.pin, version, tt.wantVersion) + } + + if pinned != tt.wantPinned { + t.Errorf("resolvePin(%q) pinned = %v, want %v", tt.pin, pinned, tt.wantPinned) + } + }) + } +} + +// TestResolvePin_BareSemverEqualsFullTag is the SC-004 invariant at the resolver +// level: `--pin v1.4.0` (bare-semver expansion) and `--pin -v1.4.0` (the +// full-tag literal) MUST resolve to the SAME git ref, so they fetch the same +// commit and thus the same treeSha. The end-to-end equivalence is also asserted +// by TestQuickstart_AddPinTagFormEquivalent over a real file:// origin; this pins +// the rule that makes that possible. 
+func TestResolvePin_BareSemverEqualsFullTag(t *testing.T) { + t.Parallel() + + const skill = "terraform-plan-review" + + bareRef, _, barePinned := resolvePin(skill, "v1.4.0", "main") + tagRef, _, tagPinned := resolvePin(skill, skill+"-v1.4.0", "main") + + if bareRef != tagRef { + t.Errorf("bare-semver ref %q != full-tag ref %q (SC-004: both forms must resolve to one tag)", bareRef, tagRef) + } + + if !barePinned || !tagPinned { + t.Errorf("both forms must be pinned, got bare=%v tag=%v", barePinned, tagPinned) + } +} + +// classifyClass names the typed fetch-error class classifyFetchError can produce, +// so the table asserts the matched class declaratively. +type classifyClass int + +const ( + classNoVersion classifyClass = iota + classFetchAuth + classFetchUnreachable + classFetchNotFound + classFetchRaw +) + +func (c classifyClass) String() string { + switch c { + case classNoVersion: + return "NoSuchVersionError" + case classFetchAuth: + return "AuthError" + case classFetchUnreachable: + return "UnreachableError" + case classFetchNotFound: + return "NotFoundError" + case classFetchRaw: + return "raw *GitError" + default: + return "unknown" + } +} + +// classifyClassOf reports which class classifyFetchError mapped err into. 
+func classifyClassOf(err error) classifyClass { + var ( + nv *NoSuchVersionError + a *AuthError + u *UnreachableError + nf *NotFoundError + ) + + switch { + case errors.As(err, &nv): + return classNoVersion + case errors.As(err, &a): + return classFetchAuth + case errors.As(err, &u): + return classFetchUnreachable + case errors.As(err, &nf): + return classFetchNotFound + default: + return classFetchRaw + } +} + +// TestClassifyFetchError pins the repo-vs-skill-vs-version distinction (FIX-3 / +// C2/C3): a missing/private/unreachable REPO is a CLONE-phase failure and must +// NEVER become NoSuchVersionError, even with --pin; only a CHECKOUT-phase failure +// of a pinned ref is a missing VERSION (the repo and skill cloned fine, the +// tag/SHA does not exist). It also pins FIX-7 — Auth/Unreachable/NotFound carry +// the configured origin identity that ClassifyGitError leaves blank. +func TestClassifyFetchError(t *testing.T) { + t.Parallel() + + const ( + owner = "my-org" + repo = "my-skills" + skill = "terraform-plan-review" + ref = skill + "-v9.9.9" + ) + + // cloneErr and checkoutErr wrap a *GitError in a fetchStepError for the clone + // and checkout phases, mirroring what fetchSparseInto produces. An error with + // no step tag (e.g. a post-fetch rev-parse failure) is one classifyFetchError + // must treat as a clone-class repo failure, never a missing version. 
+ cloneErr := func(stderr string) error { + return &fetchStepError{step: stepClone, err: &GitError{ExitCode: 128, Stderr: stderr}} + } + checkoutErr := func(stderr string) error { + return &fetchStepError{step: stepCheckout, err: &GitError{ExitCode: 128, Stderr: stderr}} + } + + const ( + notFoundStderr = "fatal: repository 'https://github.com/my-org/my-skills/' not found" + authStderr = "remote: Authentication failed for 'https://github.com/my-org/my-skills/'" + unreachStderr = "fatal: unable to access '...': Could not resolve host: github.com" + refStderr = "error: pathspec 'terraform-plan-review-v9.9.9' did not match any file(s) known to git" + ) + + tests := []struct { + name string + req FetchRequest + authed bool + err error + want classifyClass + }{ + { + name: "pinned CHECKOUT failure is a missing version (the repo cloned, the tag does not exist)", + req: FetchRequest{Owner: owner, Repo: repo, Skill: skill, Ref: ref, Pinned: true}, + err: checkoutErr(refStderr), + want: classNoVersion, + }, + { + name: "pinned CLONE not-found is a missing/private REPO, NOT a missing version (FIX-3)", + req: FetchRequest{Owner: owner, Repo: repo, Skill: skill, Ref: ref, Pinned: true}, + err: cloneErr(notFoundStderr), + want: classFetchNotFound, + }, + { + name: "unpinned checkout failure is not a version error (no pin to promote)", + req: FetchRequest{Owner: owner, Repo: repo, Skill: skill, Ref: "main"}, + err: checkoutErr(refStderr), + want: classFetchRaw, + }, + { + name: "clone auth failure classifies as AuthError", + req: FetchRequest{Owner: owner, Repo: repo, Skill: skill}, + err: cloneErr(authStderr), + want: classFetchAuth, + }, + { + name: "clone unreachable classifies as UnreachableError", + req: FetchRequest{Owner: owner, Repo: repo, Skill: skill}, + err: cloneErr(unreachStderr), + want: classFetchUnreachable, + }, + { + name: "pinned clone auth failure stays AuthError (a pin never overrides a clone-phase class)", + req: FetchRequest{Owner: owner, Repo: repo, 
Skill: skill, Ref: ref, Pinned: true}, + err: cloneErr(authStderr), + want: classFetchAuth, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Parallel() + + out := classifyFetchError(tt.req, tt.authed, tt.err) + + if got := classifyClassOf(out); got != tt.want { + t.Fatalf("classifyFetchError class = %v, want %v (err: %v)", got, tt.want, out) + } + + // Every classified error must still unwrap to the raw *GitError so + // --verbose can reach the original stderr (errors-as-navigation). + var ge *GitError + if !errors.As(out, &ge) { + t.Fatalf("classified error %v does not unwrap to *GitError", out) + } + }) + } +} + +// TestClassifyFetchError_PopulatesOrigin pins FIX-7: ClassifyGitError builds +// Auth/Unreachable/NotFound with a blank Origin; classifyFetchError must rebuild +// each with the configured origin identity so the CLI's what/why/fix names the +// real origin. NotFound additionally carries the skill name and the no-token flag +// for the "if private, authenticate" hint (D4). 
+func TestClassifyFetchError_PopulatesOrigin(t *testing.T) { + t.Parallel() + + const ( + owner = "my-org" + repo = "my-skills" + skill = "terraform-plan-review" + wantOrig = "my-org/my-skills" + ) + + base := FetchRequest{Owner: owner, Repo: repo, Skill: skill} + + clone := func(stderr string) error { + return &fetchStepError{step: stepClone, err: &GitError{ExitCode: 128, Stderr: stderr}} + } + + t.Run("AuthError carries the origin", func(t *testing.T) { + t.Parallel() + + out := classifyFetchError(base, true, clone("remote: Authentication failed for 'x'")) + + var ae *AuthError + if !errors.As(out, &ae) { + t.Fatalf("error = %T, want *AuthError", out) + } + + if ae.Origin != wantOrig { + t.Errorf("AuthError.Origin = %q, want %q (FIX-7)", ae.Origin, wantOrig) + } + }) + + t.Run("UnreachableError carries the origin", func(t *testing.T) { + t.Parallel() + + out := classifyFetchError(base, false, clone("fatal: Could not resolve host: github.com")) + + var ue *UnreachableError + if !errors.As(out, &ue) { + t.Fatalf("error = %T, want *UnreachableError", out) + } + + if ue.Origin != wantOrig { + t.Errorf("UnreachableError.Origin = %q, want %q (FIX-7)", ue.Origin, wantOrig) + } + }) + + t.Run("NotFound without a token records the no-token flag for the auth hint", func(t *testing.T) { + t.Parallel() + + // authenticated=false → the CLI adds the "if private, authenticate" hint. 
+ out := classifyFetchError(base, false, clone("fatal: repository 'x' not found")) + + var nf *NotFoundError + if !errors.As(out, &nf) { + t.Fatalf("error = %T, want *NotFoundError", out) + } + + if nf.Origin != wantOrig { + t.Errorf("NotFoundError.Origin = %q, want %q", nf.Origin, wantOrig) + } + + if nf.Skill != skill { + t.Errorf("NotFoundError.Skill = %q, want %q", nf.Skill, skill) + } + + if nf.Authenticated { + t.Error("NotFoundError.Authenticated = true, want false (no token resolved → auth hint)") + } + }) +} + +// TestClassifyGitError_LocalUnreachable pins F3: a missing/unreachable LOCAL +// (file:// or path) origin fails with git's local anchors — "does not appear to +// be a git repository" / "Could not read from remote repository" — which carry +// no "not found" substring, so they must classify as *UnreachableError rather +// than falling through to a generic *GitError. The host/connect anchors and the +// auth/not-found classes stay in their own buckets (no overlap). +func TestClassifyGitError_LocalUnreachable(t *testing.T) { + t.Parallel() + + tests := []struct { + name string + stderr string + want classifyClass + }{ + { + name: "local path is not a git repository (the F3 anchor)", + stderr: "fatal: '/x' does not appear to be a git repository", + want: classFetchUnreachable, + }, + { + name: "could not read from a missing local remote", + stderr: "fatal: Could not read from remote repository.\nPlease make sure you have the correct access rights", + want: classFetchUnreachable, + }, + { + name: "host resolution still classifies as unreachable (unchanged)", + stderr: "fatal: unable to access '...': Could not resolve host: github.com", + want: classFetchUnreachable, + }, + { + name: "auth is still matched before unreachable (no conflict)", + stderr: "remote: Authentication failed for 'https://github.com/my-org/my-skills/'", + want: classFetchAuth, + }, + { + name: "repository-not-found is unaffected (no local anchor matches)", + stderr: "fatal: 
repository 'https://github.com/my-org/my-skills/' not found", + want: classFetchNotFound, + }, + { + name: "an unrecognized stderr stays a raw *GitError", + stderr: "fatal: something else entirely", + want: classFetchRaw, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Parallel() + + in := &GitError{ExitCode: 128, Stderr: tt.stderr} + + out := ClassifyGitError(in) + + if got := classifyClassOf(out); got != tt.want { + t.Fatalf("ClassifyGitError class = %v, want %v (err: %v)", got, tt.want, out) + } + + // The raw *GitError must always remain reachable for --verbose. + var ge *GitError + if !errors.As(out, &ge) { + t.Fatalf("classified error %v does not unwrap to *GitError", out) + } + }) + } +} diff --git a/pkg/skillcore/git.go b/pkg/skillcore/git.go index 97a219c..bf61af8 100644 --- a/pkg/skillcore/git.go +++ b/pkg/skillcore/git.go @@ -3,7 +3,9 @@ package skillcore import ( "bytes" "context" + "encoding/base64" "errors" + "os" "os/exec" "strings" ) @@ -26,14 +28,38 @@ func newGitClient() *gitClient { // run invokes git with args, capturing stdout and stderr into buffers. On a // non-zero exit it returns a *GitError carrying the exit code and trimmed -// stderr; on success it returns the trimmed stdout. +// stderr; on success it returns the trimmed stdout. It inherits the parent +// environment unchanged (no credential injection — see runEnv for that). func (c *gitClient) run(ctx context.Context, args ...string) (string, error) { + return c.runEnv(ctx, nil, args...) +} + +// runEnv is run with extra environment variables. When env is non-empty it is +// appended to the parent environment — the seam that injects a GitHub credential +// via git's GIT_CONFIG_* vars (authConfigEnv) so the token lands in the process +// ENVIRON, never argv (gh-cli keeps the token out of argv too; a `-c +// http.extraHeader=...` flag would be visible in `ps`). 
When env is empty cmd.Env +// is left as the command context set it (nil in production → the child inherits +// the parent env unchanged). +// +// The base for the append is any cmd.Env the command context already populated, +// falling back to os.Environ() when it left it nil; this both yields the real +// parent env in production AND preserves an env the test seam pre-set on the cmd. +func (c *gitClient) runEnv(ctx context.Context, env []string, args ...string) (string, error) { var stdout, stderr bytes.Buffer cmd := c.commandContext(ctx, "git", args...) cmd.Stdout = &stdout cmd.Stderr = &stderr + if len(env) > 0 { + if cmd.Env == nil { + cmd.Env = os.Environ() + } + + cmd.Env = append(cmd.Env, env...) + } + if err := cmd.Run(); err != nil { // A non-zero git exit surfaces as *exec.ExitError carrying the code; // any other failure (e.g. git not on PATH) has no exit code, so we @@ -85,6 +111,212 @@ func (c *gitClient) statusPorcelain(gitDir, relPath string) (string, error) { ) } +// authConfigEnv returns the GIT_CONFIG_* environment variables (git >=2.31) that +// inject token as an HTTP Basic http.extraHeader credential. Passing the config +// through the ENVIRON — not a `-c http.extraHeader=...` argv flag — keeps the +// base64 credential out of the process argv (where `ps` would expose it); gh-cli +// keeps its token out of argv for the same reason (research D4). The token never +// appears in the clone URL either. An empty token yields no env (unauthenticated +// fetch). Callers thread the result to runEnv. +func authConfigEnv(token string) []string { + if token == "" { + return []string{} + } + + // GitHub accepts any non-empty username with a token as the password; the + // conventional "x-access-token" username matches gh's own header. + basic := base64.StdEncoding.EncodeToString([]byte("x-access-token:" + token)) + + // GIT_CONFIG_COUNT=N with GIT_CONFIG_KEY_i/GIT_CONFIG_VALUE_i pairs is git's + // environment form of `-c =` — the value never reaches argv. 
+ return []string{ + "GIT_CONFIG_COUNT=1", + "GIT_CONFIG_KEY_0=http.extraHeader", + "GIT_CONFIG_VALUE_0=Authorization: Basic " + basic, + } +} + +// Clone runs a partial, sparse, no-checkout clone of repoURL into destDir, +// authenticating with token when non-empty (research D7: one git transport for +// both skill subtrees and the catalog). It fetches no blobs and lays down no +// working tree yet — the caller selects paths via FetchSparse-style checkout. +// A non-zero git exit surfaces as *GitError (the stub seam classifies it). +func (c *gitClient) Clone(ctx context.Context, repoURL, destDir, token string) error { + if strings.HasPrefix(repoURL, "-") { + return &GitError{ + ExitCode: -1, + Stderr: "refusing to clone a URL that begins with '-': " + repoURL, + } + } + + _, err := c.runEnv( + ctx, + authConfigEnv(token), + "clone", + "--filter=blob:none", + "--sparse", + "--no-checkout", + "--", + repoURL, + destDir, + ) + + return err +} + +// FetchSparse sparse-checks-out a single skillPath from repoURL at ref into a +// fresh temp dir and returns that dir. It clones (partial + sparse + no-checkout) +// into the temp dir, narrows the sparse cone to skillPath, then checks out ref — +// so only that subtree materializes on disk. token is injected per git +// invocation via the GIT_CONFIG http.extraHeader env (kept out of argv) when +// non-empty (research D4/D7). +// +// The returned dir is the caller's to remove. On any git failure the temp dir is +// cleaned up and a *GitError is returned (the stub seam classifies exit/stderr). 
+func (c *gitClient) FetchSparse( + ctx context.Context, + repoURL, skillPath, ref, token string, +) (string, error) { + if strings.HasPrefix(skillPath, "-") { + return "", &GitError{ + ExitCode: -1, + Stderr: "refusing to use a path that begins with '-': " + skillPath, + } + } + + if strings.HasPrefix(ref, "-") { + return "", &GitError{ + ExitCode: -1, + Stderr: "refusing to use a ref that begins with '-': " + ref, + } + } + + tmpDir, err := os.MkdirTemp("", "skillrig-fetch-*") + if err != nil { + return "", err + } + + if err := c.fetchSparseInto(ctx, tmpDir, repoURL, skillPath, ref, token); err != nil { + // Best-effort cleanup; the git failure is the error worth surfacing. + _ = os.RemoveAll(tmpDir) + + return "", err + } + + return tmpDir, nil +} + +// fetchSparseInto performs the git steps of FetchSparse against an existing dir, +// keeping FetchSparse's temp-dir lifecycle (create/cleanup) separate from the git +// sequence so the error path has a single cleanup site. +// +// It distinguishes the two failure phases (FIX-3): a failure in the CLONE phase +// (the clone, or narrowing the sparse cone to skillPath) is a repo/auth/ +// unreachable problem; a failure in the CHECKOUT of ref is a missing VERSION. +// Each phase's *GitError is wrapped in a *fetchStepError so classifyFetchError can +// promote only a checkout-step failure of a --pin to NoSuchVersionError. +// +// FIX-6: a commit SHA pinned with --pin is not reachable by the tip-only sparse +// clone, so an off-tip ref is materialized with an explicit `git fetch origin +// ` before checkout. A branch/tag already present from the clone makes the +// fetch a harmless no-op-equivalent. The fetch is best-effort: its error is +// discarded, so a ref that still cannot be resolved fails at the checkout step. 
+func (c *gitClient) fetchSparseInto( + ctx context.Context, + dir, repoURL, skillPath, ref, token string, +) error { + if err := c.Clone(ctx, repoURL, dir, token); err != nil { + return &fetchStepError{step: stepClone, err: err} + } + + // The token rides in the GIT_CONFIG env (kept out of argv), threaded to each + // authenticated invocation via runEnv. + auth := authConfigEnv(token) + + if _, err := c.runEnv(ctx, auth, "-C", dir, "sparse-checkout", "set", "--", skillPath); err != nil { + return &fetchStepError{step: stepClone, err: err} + } + + // Materialize an off-tip ref (an arbitrary commit SHA, FIX-6). The fetch is + // best-effort: its error is deliberately discarded, because only the checkout + // below is authoritative for ref existence. A ref the fetch could not + // materialize therefore surfaces as a checkout-step failure. + _, _ = c.runEnv(ctx, auth, "-C", dir, "fetch", "--depth", "1", "origin", ref) + + if _, err := c.runEnv(ctx, auth, "-C", dir, "checkout", ref); err != nil { + return &fetchStepError{step: stepCheckout, err: err} + } + + return nil +} + +// fetchStep identifies which phase of fetchSparseInto failed, so the failure can +// be classified as a repo problem (clone) versus a missing version (checkout). +type fetchStep int + +const ( + stepClone fetchStep = iota // clone / sparse-cone — repo/auth/unreachable + stepCheckout // checkout of the requested ref — version existence +) + +// fetchStepError wraps a *GitError with the phase that produced it. It is +// internal to skillcore's fetch layer; classifyFetchError unwraps it to decide +// whether a --pin failure is a missing version (checkout) or a missing/private +// repo (clone). It still unwraps to the underlying *GitError for --verbose. 
+type fetchStepError struct { + step fetchStep + err error +} + +func (e *fetchStepError) Error() string { return e.err.Error() } +func (e *fetchStepError) Unwrap() error { return e.err } + +// FetchFile fetches the bytes of a single repo-relative file from repoURL at ref +// without materializing a working tree: it clones partial + no-checkout into a +// fresh temp dir, then `git show :` streams the blob (the partial +// clone fetches just that object on demand). token is injected per invocation via +// the GIT_CONFIG http.extraHeader env (kept out of argv) when non-empty (research +// D4/D7). It is the catalog fetch's +// transport (FetchCatalog) — one git transport for both the skill subtree and +// index.json. The temp dir is removed before returning; only the bytes survive. +func (c *gitClient) FetchFile( + ctx context.Context, + repoURL, file, ref, token string, +) ([]byte, error) { + if strings.HasPrefix(file, "-") { + return nil, &GitError{ + ExitCode: -1, + Stderr: "refusing to use a path that begins with '-': " + file, + } + } + + if strings.HasPrefix(ref, "-") { + return nil, &GitError{ + ExitCode: -1, + Stderr: "refusing to use a ref that begins with '-': " + ref, + } + } + + tmpDir, err := os.MkdirTemp("", "skillrig-catalog-*") + if err != nil { + return nil, err + } + + defer func() { _ = os.RemoveAll(tmpDir) }() + + if err := c.Clone(ctx, repoURL, tmpDir, token); err != nil { + return nil, &fetchStepError{step: stepClone, err: err} + } + + // The token rides in the GIT_CONFIG env (kept out of argv), not a `-c` flag. + out, err := c.runEnv(ctx, authConfigEnv(token), "-C", tmpDir, "show", ref+":"+file) + if err != nil { + return nil, &fetchStepError{step: stepCheckout, err: err} + } + + return []byte(out), nil +} + // revParse runs `git -C rev-parse ` using the default client (the // real git binary). It is the package-level entry point TreeSHA dispatches to; // the client method underneath stays pluggable for skillcore's own unit tests. 
@@ -97,3 +329,74 @@ func revParse(gitDir, rev string) (string, error) { func statusPorcelain(gitDir, relPath string) (string, error) { return newGitClient().statusPorcelain(gitDir, relPath) } + +// ResolveGitHubToken resolves a GitHub token for hostname, mirroring gh's own +// precedence (research D4): GH_TOKEN env → GITHUB_TOKEN env → `gh auth token +// --hostname `. It returns (token, true) on the first non-empty source +// and ("", false) when none yields a token. Absence is never fatal: gh missing +// from PATH or exiting non-zero (no session) is a clean skip, not an error — an +// unauthenticated fetch is still valid for a public origin. +// +// hostname is the seam for GitHub Enterprise; today callers pass "github.com". +func ResolveGitHubToken(hostname string) (string, bool) { + if token := strings.TrimSpace(os.Getenv("GH_TOKEN")); token != "" { + return token, true + } + + if token := strings.TrimSpace(os.Getenv("GITHUB_TOKEN")); token != "" { + return token, true + } + + return ghAuthToken(hostname) +} + +// ghAuthToken shells `gh auth token --hostname ` to surface a +// keyring-stored token that reading hosts.yml directly would miss. gh absent from +// PATH, or any non-zero exit (no authenticated session), is a skip — ("", false) +// — never a fatal error. +func ghAuthToken(hostname string) (string, bool) { + ghPath, err := exec.LookPath("gh") + if err != nil { + return "", false + } + + var stdout bytes.Buffer + + //nolint:gosec // G204: fixed `gh auth token` argv; hostname is the caller-controlled host seam (today "github.com"), never untrusted input. 
+ cmd := exec.CommandContext(context.Background(), ghPath, "auth", "token", "--hostname", hostname) + cmd.Stdout = &stdout + cmd.Stderr = nil + + if err := cmd.Run(); err != nil { + return "", false + } + + token := strings.TrimSpace(stdout.String()) + if token == "" { + return "", false + } + + return token, true +} + +// FetchSparse sparse-checks-out skillPath from repoURL at ref into a fresh temp +// dir using the default client (the real git binary), authenticating with token +// when non-empty. It is the package-level entry point add's remote path +// dispatches to; the client method underneath stays pluggable for unit tests. +func FetchSparse( + ctx context.Context, + repoURL, skillPath, ref, token string, +) (string, error) { + return newGitClient().FetchSparse(ctx, repoURL, skillPath, ref, token) +} + +// FetchFile fetches the bytes of a single repo-relative file from repoURL at ref +// using the default client (the real git binary), authenticating with token when +// non-empty. It is the package-level entry point FetchCatalog dispatches to; the +// client method underneath stays pluggable for unit tests. +func FetchFile( + ctx context.Context, + repoURL, file, ref, token string, +) ([]byte, error) { + return newGitClient().FetchFile(ctx, repoURL, file, ref, token) +} diff --git a/pkg/skillcore/helpers_test.go b/pkg/skillcore/helpers_test.go index 7f6e02e..96da0f5 100644 --- a/pkg/skillcore/helpers_test.go +++ b/pkg/skillcore/helpers_test.go @@ -89,7 +89,29 @@ version = "0.50.0" optional = true ` -const sampleSkillMd = "# terraform-plan-review\n\nReview a terraform plan.\n" +// sampleSkillMd is a representative SKILL.md carrying the standard agentskills.io +// frontmatter plus the namespaced metadata.x-skillrig.* skillrig extensions. Its +// `name` is the skill directory used by bootstrapOrigin (name == dir, the parse +// contract); the Add happy path parses it via ParseManifest. 
+const sampleSkillMd = `--- +name: terraform-plan-review +description: Review a terraform plan for risk and drift. +metadata: + x-skillrig.namespace: my-org + x-skillrig.version: 1.4.0 + x-skillrig.convention-version: "1" + x-skillrig.topics: [platform-team, terraform, aws] + x-skillrig.requires: + - tool: oxid + version: ">=0.4.0" + source: my-org/my-skills + manager: mise +--- + +# terraform-plan-review + +Review a terraform plan. +` // bootstrapOrigin creates a real git repo in a fresh tmpDir containing a single // committed skill at skills//, returning the repo dir and skill name. The diff --git a/pkg/skillcore/manifest.go b/pkg/skillcore/manifest.go index 7545752..49d6db9 100644 --- a/pkg/skillcore/manifest.go +++ b/pkg/skillcore/manifest.go @@ -1,46 +1,188 @@ package skillcore import ( + "bytes" "fmt" "os" + "path/filepath" - "github.com/pelletier/go-toml/v2" + "gopkg.in/yaml.v3" ) -// Manifest is a parsed skill.toml. It is read-only: ParseManifest reads it and -// Add/Verify consume Name/Version, but skillcore never writes it back. +// Manifest is the machine metadata lifted from a skill's SKILL.md YAML +// frontmatter. It is read-only: ParseManifest reads it and Add/Verify consume +// Name/Version, but skillcore never writes it back. Standard agentskills.io +// fields (name, description) are read verbatim; skillrig-specific data lives +// under the standard metadata map as flat, dotted "x-skillrig.*" keys. 
type Manifest struct { - Name string `toml:"name"` - Version string `toml:"version"` - Namespace string `toml:"namespace"` - Description string `toml:"description"` - Tags []string `toml:"tags"` - Requires []Require `toml:"requires"` + Name string // standard frontmatter `name` + Description string // standard frontmatter `description` + Namespace string // metadata."x-skillrig.namespace" + Version string // metadata."x-skillrig.version" + Convention string // metadata."x-skillrig.convention-version" + Topics []string // metadata."x-skillrig.topics" + Requires []Require // metadata."x-skillrig.requires" } -// Require is one tool dependency declared in a skill.toml. It is parsed but is -// deliberately NOT written to the lock — the on-disk manifest stays the single -// source of truth, read later by doctor. +// Require is one tool dependency declared under metadata."x-skillrig.requires". +// It is parsed from SKILL.md frontmatter (yaml) and serialized into index.json +// and `search --json` (json). The json tags are lowercase (FIX-5) so the catalog +// emits "tool"/"version"/… as the data-model §2 specifies — without them +// encoding/json would default to PascalCase ("Tool"), breaking every catalog +// consumer. It is deliberately NOT written to the lock — the on-disk manifest +// stays the single source of truth, read later by doctor. type Require struct { - Tool string `toml:"tool"` - Version string `toml:"version"` - Source string `toml:"source"` - Manager string `toml:"manager"` + Tool string `yaml:"tool" json:"tool"` + Version string `yaml:"version" json:"version"` + Source string `yaml:"source" json:"source"` + Manager string `yaml:"manager" json:"manager"` } -// ParseManifest parses the skill.toml at path. Unknown keys are ignored for -// forward-compatibility (default Unmarshal — strict mode is deliberately off). 
+// frontmatter mirrors the agentskills.io SKILL.md frontmatter we consume: the +// standard top-level fields plus the free-form metadata map that carries the +// namespaced skillrig extensions. Unknown keys are ignored (forward-compat). +type frontmatter struct { + Name string `yaml:"name"` + Description string `yaml:"description"` + Metadata map[string]any `yaml:"metadata"` +} + +// ParseManifest reads the SKILL.md at path, parses its YAML frontmatter, and +// lifts the namespaced metadata."x-skillrig.*" keys into a Manifest. It +// validates that name is present and equals the skill's directory name (the +// directory containing SKILL.md) and that version is present. Unknown keys are +// ignored for forward-compatibility. func ParseManifest(path string) (Manifest, error) { - //nolint:gosec // G304: path is a skill.toml within the resolved origin/repo subtree. + //nolint:gosec // G304: path is a SKILL.md within the resolved origin/repo subtree. data, err := os.ReadFile(path) if err != nil { return Manifest{}, fmt.Errorf("reading skill manifest %q: %w", path, err) } - var m Manifest - if err := toml.Unmarshal(data, &m); err != nil { + block, err := extractFrontmatter(data) + if err != nil { + return Manifest{}, fmt.Errorf("parsing skill manifest %q: %w", path, err) + } + + var fm frontmatter + if err := yaml.Unmarshal(block, &fm); err != nil { return Manifest{}, fmt.Errorf("parsing skill manifest %q: %w", path, err) } + m := Manifest{ + Name: fm.Name, + Description: fm.Description, + Namespace: metaString(fm.Metadata, "x-skillrig.namespace"), + Version: metaString(fm.Metadata, "x-skillrig.version"), + Convention: metaString(fm.Metadata, "x-skillrig.convention-version"), + Topics: metaStrings(fm.Metadata, "x-skillrig.topics"), + Requires: metaRequires(fm.Metadata, "x-skillrig.requires"), + } + + dir := filepath.Base(filepath.Dir(path)) + if err := validateManifest(m, dir); err != nil { + return Manifest{}, fmt.Errorf("invalid skill manifest %q: %w", path, err) + } 
+ return m, nil } + +// validateManifest enforces the manifest contract: name is required and MUST +// equal the skill directory name (removing the name/directory drift the old +// skill.toml allowed), and version is required. +func validateManifest(m Manifest, dir string) error { + if m.Name == "" { + return fmt.Errorf("frontmatter %q is required", "name") + } + + if m.Name != dir { + return fmt.Errorf("name %q must equal the skill directory %q", m.Name, dir) + } + + if m.Version == "" { + return fmt.Errorf("metadata %q is required", "x-skillrig.version") + } + + return nil +} + +// extractFrontmatter returns the YAML between the leading and the next "---" +// fences of a SKILL.md. A SKILL.md with no frontmatter block is an error: the +// machine metadata skillrig needs lives there. +func extractFrontmatter(data []byte) ([]byte, error) { + const fence = "---" + + rest, ok := bytes.CutPrefix(data, []byte(fence+"\n")) + if !ok { + // Only an exact opening fence on the very first line is accepted: a + // leading BOM, blank line, or anything else means no frontmatter to read. + return nil, fmt.Errorf("missing YAML frontmatter (no leading %q fence)", fence) + } + + end := bytes.Index(rest, []byte("\n"+fence)) + if end < 0 { + return nil, fmt.Errorf("unterminated YAML frontmatter (no closing %q fence)", fence) + } + + return rest[:end+1], nil +} + +// metaString reads a string value at key from the metadata map (absent → ""). +func metaString(meta map[string]any, key string) string { + if v, ok := meta[key].(string); ok { + return v + } + + return "" +} + +// metaStrings reads a []string value at key, coercing each element via +// fmt.Sprint so a YAML list of scalars (e.g. unquoted topics) is accepted. 
+func metaStrings(meta map[string]any, key string) []string { + raw, ok := meta[key].([]any) + if !ok { + return nil + } + + out := make([]string, 0, len(raw)) + for _, v := range raw { + out = append(out, fmt.Sprint(v)) + } + + return out +} + +// metaRequires reads the nested x-skillrig.requires list into []Require. Each +// element is a YAML map; missing fields default to the zero string. +func metaRequires(meta map[string]any, key string) []Require { + raw, ok := meta[key].([]any) + if !ok { + return nil + } + + out := make([]Require, 0, len(raw)) + for _, item := range raw { + entry, ok := item.(map[string]any) + if !ok { + continue + } + + out = append(out, Require{ + Tool: mapString(entry, "tool"), + Version: mapString(entry, "version"), + Source: mapString(entry, "source"), + Manager: mapString(entry, "manager"), + }) + } + + return out +} + +// mapString reads a string value at key from a decoded YAML map (absent → ""). +func mapString(m map[string]any, key string) string { + if v, ok := m[key].(string); ok { + return v + } + + return "" +} diff --git a/pkg/skillcore/manifest_test.go b/pkg/skillcore/manifest_test.go index 835c730..8fd81ca 100644 --- a/pkg/skillcore/manifest_test.go +++ b/pkg/skillcore/manifest_test.go @@ -6,27 +6,65 @@ import ( "testing" ) -// TestParseManifest_RequiresAndUnknownKeys asserts the parse contract: a -// skill.toml with [[requires]] array-of-tables AND unknown keys (both top-level -// and per-require) parses into the expected Manifest, ignoring the unknowns with -// no error (forward-compat — strict mode is deliberately off). -func TestParseManifest_RequiresAndUnknownKeys(t *testing.T) { - t.Parallel() +// writeSkillMd writes content to //SKILL.md and returns the path. The +// parent directory is named after the skill so the parse contract (name == dir) +// can hold. 
+func writeSkillMd(t *testing.T, name, content string) string { + t.Helper() dir := t.TempDir() - writeFile(t, dir, "skill.toml", 0o644, sampleManifest) + writeFile(t, dir, filepath.Join(name, "SKILL.md"), 0o644, content) + + return filepath.Join(dir, name, "SKILL.md") +} + +// TestParseManifest_FrontmatterAndUnknownKeys asserts the parse contract: a +// SKILL.md whose YAML frontmatter carries the standard fields plus the namespaced +// metadata.x-skillrig.* extensions (including a nested requires list) AND unknown +// keys parses into the expected Manifest, ignoring the unknowns with no error +// (forward-compat). +func TestParseManifest_FrontmatterAndUnknownKeys(t *testing.T) { + t.Parallel() - got, err := ParseManifest(filepath.Join(dir, "skill.toml")) + const skillMd = `--- +name: terraform-plan-review +description: Review a terraform plan +license: MIT +metadata: + x-skillrig.namespace: dev.skillrig.samples + x-skillrig.version: 1.4.0 + x-skillrig.convention-version: "1" + x-skillrig.topics: [terraform, review] + x-skillrig.experimental: true + x-skillrig.requires: + - tool: terraform + version: ">=1.5" + source: https://releases.hashicorp.com + manager: asdf + optional: true + - tool: tflint + version: 0.50.0 +--- + +# Terraform Plan Review + +Body. 
+` + + path := writeSkillMd(t, "terraform-plan-review", skillMd) + + got, err := ParseManifest(path) if err != nil { t.Fatalf("ParseManifest: unexpected error: %v", err) } want := Manifest{ Name: "terraform-plan-review", - Version: "1.4.0", - Namespace: "dev.skillrig.samples", Description: "Review a terraform plan", - Tags: []string{"terraform", "review"}, + Namespace: "dev.skillrig.samples", + Version: "1.4.0", + Convention: "1", + Topics: []string{"terraform", "review"}, Requires: []Require{ { Tool: "terraform", @@ -46,15 +84,24 @@ func TestParseManifest_RequiresAndUnknownKeys(t *testing.T) { } } -// TestParseManifest_Minimal confirms a manifest with no [[requires]] parses to a -// nil/empty Requires slice (add only needs name + version). +// TestParseManifest_Minimal confirms a SKILL.md with only the required standard +// name + x-skillrig.version parses to a nil/empty Requires slice (add only needs +// name + version). func TestParseManifest_Minimal(t *testing.T) { t.Parallel() - dir := t.TempDir() - writeFile(t, dir, "skill.toml", 0o644, "name = \"solo\"\nversion = \"0.1.0\"\n") + const skillMd = `--- +name: solo +metadata: + x-skillrig.version: 0.1.0 +--- + +# solo +` - got, err := ParseManifest(filepath.Join(dir, "skill.toml")) + path := writeSkillMd(t, "solo", skillMd) + + got, err := ParseManifest(path) if err != nil { t.Fatalf("ParseManifest: %v", err) } @@ -68,30 +115,65 @@ func TestParseManifest_Minimal(t *testing.T) { } } -// TestParseManifest_Errors covers the failure surface: a missing file and -// malformed TOML must both return an error (the CLI renders it; the SDK only -// returns it). +// TestParseManifest_Errors covers the failure surface: a missing file, a +// SKILL.md with no frontmatter, malformed YAML, a name that does not match the +// directory, and a missing version must all return an error (the CLI renders it; +// the SDK only returns it). 
func TestParseManifest_Errors(t *testing.T) { t.Parallel() t.Run("missing file", func(t *testing.T) { t.Parallel() - _, err := ParseManifest(filepath.Join(t.TempDir(), "absent.toml")) + _, err := ParseManifest(filepath.Join(t.TempDir(), "skill", "SKILL.md")) if err == nil { t.Fatal("ParseManifest(absent): want error, got nil") } }) - t.Run("malformed toml", func(t *testing.T) { + t.Run("no frontmatter", func(t *testing.T) { + t.Parallel() + + path := writeSkillMd(t, "solo", "# solo\n\nNo frontmatter here.\n") + + _, err := ParseManifest(path) + if err == nil { + t.Fatal("ParseManifest(no frontmatter): want error, got nil") + } + }) + + t.Run("malformed yaml", func(t *testing.T) { t.Parallel() - dir := t.TempDir() - writeFile(t, dir, "skill.toml", 0o644, "name = \"x\"\nthis is = = not toml\n") + path := writeSkillMd(t, "solo", "---\nname: x\n bad: : indent\n---\n") - _, err := ParseManifest(filepath.Join(dir, "skill.toml")) + _, err := ParseManifest(path) if err == nil { t.Fatal("ParseManifest(malformed): want error, got nil") } }) + + t.Run("name mismatches directory", func(t *testing.T) { + t.Parallel() + + const skillMd = "---\nname: other\nmetadata:\n x-skillrig.version: 0.1.0\n---\n" + + path := writeSkillMd(t, "solo", skillMd) + + _, err := ParseManifest(path) + if err == nil { + t.Fatal("ParseManifest(name != dir): want error, got nil") + } + }) + + t.Run("missing version", func(t *testing.T) { + t.Parallel() + + path := writeSkillMd(t, "solo", "---\nname: solo\n---\n") + + _, err := ParseManifest(path) + if err == nil { + t.Fatal("ParseManifest(no version): want error, got nil") + } + }) } diff --git a/specledger/003-search-remote/checklists/requirements.md b/specledger/003-search-remote/checklists/requirements.md new file mode 100644 index 0000000..41aab6f --- /dev/null +++ b/specledger/003-search-remote/checklists/requirements.md @@ -0,0 +1,36 @@ +# Specification Quality Checklist: Discover & Acquire Skills (`search` + remote `add`) + +**Purpose**: 
Validate specification completeness and quality before proceeding to planning +**Created**: 2026-05-30 +**Feature**: [spec.md](../spec.md) + +## Content Quality + +- [x] No implementation details (languages, frameworks, APIs) +- [x] Focused on user value and business needs +- [x] Written for non-technical stakeholders +- [x] All mandatory sections completed + +## Requirement Completeness + +- [x] No [NEEDS CLARIFICATION] markers remain +- [x] Requirements are testable and unambiguous +- [x] Success criteria are measurable +- [x] Success criteria are technology-agnostic (no implementation details) +- [x] All acceptance scenarios are defined +- [x] Edge cases are identified +- [x] Scope is clearly bounded +- [x] Dependencies and assumptions identified + +## Feature Readiness + +- [x] All functional requirements have clear acceptance criteria +- [x] User scenarios cover primary flows +- [x] Feature meets measurable outcomes defined in Success Criteria +- [x] No implementation details leak into specification + +## Notes + +- The spec deliberately defers **seven open decisions** to `/specledger.clarify` (enumerated in [spec-tech.md](../spec-tech.md) §8). These are recorded as documented Assumptions in spec.md (not as `[NEEDS CLARIFICATION]` blockers), per the user's explicit instruction to route them through the separate `/clarify` step. The spec is internally consistent under those leanings; `/clarify` may revise them. +- All technical mechanics (transport, authentication, fingerprint, catalog artifact names, network test tier) live in `spec-tech.md`, keeping `spec.md` user-facing per the WRITE-OUT instruction. +- FR-023/FR-024 are process/co-evolution requirements (origin-template + roadmap/architecture updates) intentionally tracked in-spec so they are not lost. 
diff --git a/specledger/003-search-remote/contracts/add-remote.md b/specledger/003-search-remote/contracts/add-remote.md new file mode 100644 index 0000000..d77c5b0 --- /dev/null +++ b/specledger/003-search-remote/contracts/add-remote.md @@ -0,0 +1,45 @@ +# Contract: `skillrig add` — remote acquisition + `--pin` (Vendor Mutation) + +Extends 002's local-copy `add` with a **remote fetch** path and reproducible pinning. Byte-identical vendoring, idempotent no-op, and force-on-divergence are **unchanged from 002**; this contract adds what's new. + +## Synopsis +``` +skillrig add [--pin ] [--dry-run] [--force] [--json] [--verbose] +``` + +## Flags (new/changed) +| Flag | Meaning | +|---|---| +| `` | skill directory name (single safe path segment — 002 path-traversal guard applies) | +| `--pin ` | acquire an immutable tag/SHA (D5); distinct from the origin `@ref` branch. **Resolution order (C3, deterministic):** (1) if `` matches `^v?SEMVER$` (a bare version like `v1.4.0`/`1.4.0`) → expand via `tag_scheme name-vSEMVER` to `-v` and resolve that tag; (2) else treat `` as a **literal git ref** (a full `-vX.Y.Z` tag or a commit SHA) passed through unchanged. No ambiguity: a bare semver is always tag-expanded, anything else is literal; the two forms for the same release MUST resolve to the same commit/treeSha (asserted by quickstart). | +| `--dry-run`/`--force`/`--json`/`--verbose` | as 002 | + +## Behavior +1. Resolve origin; **classify form** (local path vs remote `OWNER/REPO` — D3). Report which form is used. +2. **Remote form:** resolve token (D4: `GH_TOKEN`→`GITHUB_TOKEN`→`gh auth token`); `git clone --sparse` the skill `path` (from the catalog or the conventional `skills/`) at `--pin` ref else origin `@ref` (D7); inject token via `git -c http.extraHeader`. + - **Local-path form:** 002 behavior on the explicit path (no fetch). +3. Compute `commit` (`git rev-parse` of the fetched ref) + `treeSha` (`skillcore.TreeSHA` of the subtree — same code `verify` uses, AP-04). +4. 
Vendor byte-identically into `.agents/skills//` (002 copy + symlink guard). +5. Write/refresh the lock entry: `version` = resolved tag (pin) else manifest `version`; `commit`; `treeSha`; `path` (data-model §3). +6. Idempotent: same version+content already present → `unchanged`, **exit 0**, nothing written. Divergent local content → refuse, instruct `--force` (002). + +## Output +- **Human:** one-line result (`added ()` | `unchanged` | dry-run preview) + footer (`run: skillrig verify`). Bounded lines. +- **`--json`:** the written lock entry (`version`/`commit`/`treeSha`/`path`) — structurally complete (Constitution §II). + +## Exit codes +| Code | When | +|---|---| +| 0 | vendor completed OR idempotent no-op | +| 1 | no origin; not found; auth; unreachable; convention mismatch; invalid name; symlink; divergence-without-`--force` | + +(Exit 2/3 reserved, never emitted here.) + +## Errors (distinct, exit 1 — data-model §5) +`NotFoundError` (skill not published — *if private + no token, hint to authenticate*, D4), `AuthError`, `UnreachableError`, `IncompatibleConventionError`, plus 002's `OriginNotFoundError`/`InvalidSkillNameError`/`SymlinkUnsupportedError`/overwrite-divergence. **Pin to a non-existent version → a distinct `NoSuchVersionError`** (C2) — a separate typed error from `NotFoundError` (the skill exists, the version doesn't), so callers branch on the type, not on message text. `--verbose` → raw cause. + +## Help +Purpose + ≥2 examples (`skillrig add terraform-plan-review`, `skillrig add terraform-plan-review --pin v1.4.0`). `TestQuickstart_AddHelpExamples` (002, extend for `--pin`). + +## Tests +`TestQuickstart_AddRemoteNoLocalCopy` (+ then `verify` passes), `_AddRemoteIdempotent`, `_AddPinnedReproducible` (two clean repos byte-identical), `_AddPinNotFound`, `_AddAuthFailureDistinct`, `_AddUnreachableDistinct`, `_AddPrivateNotFoundHintsAuth`, plus 002's local-path suite stays green (SC-007). 
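
The `--pin` resolution order from the Flags table can be sketched as below. This is a minimal illustration, not the project's code: `resolvePinRef` is a hypothetical helper, and the regular expression is an assumed reading of `^v?SEMVER$` that accepts only `MAJOR.MINOR.PATCH` (no pre-release or build-metadata suffixes).

```go
package main

import (
	"fmt"
	"regexp"
)

// bareSemver is an ASSUMED reading of `^v?SEMVER$`: MAJOR.MINOR.PATCH only.
// Pre-release and build-metadata suffixes are not handled in this sketch.
var bareSemver = regexp.MustCompile(`^v?(\d+\.\d+\.\d+)$`)

// resolvePinRef (hypothetical helper) maps a --pin value to the git ref to fetch.
// Rule 1: a bare semver is always expanded via the name-vSEMVER tag scheme.
// Rule 2: anything else (full tag, commit SHA) is passed through unchanged.
func resolvePinRef(skill, pin string) string {
	if m := bareSemver.FindStringSubmatch(pin); m != nil {
		return skill + "-v" + m[1]
	}
	return pin
}

func main() {
	pins := []string{
		"v1.4.0",
		"1.4.0",
		"terraform-plan-review-v1.4.0",
		"0123456789abcdef0123456789abcdef01234567",
	}
	for _, pin := range pins {
		fmt.Printf("%s -> %s\n", pin, resolvePinRef("terraform-plan-review", pin))
	}
}
```

Because the first two forms and the full tag all yield the same ref, the later `git rev-parse` and tree-SHA steps see identical input, which is what the reproducibility test relies on. Whether the resulting ref exists is a separate check (the `NoSuchVersionError` case).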
Ground-truth: lock `treeSha` == raw `git ls-tree` over the `file://` bare-repo fixture. diff --git a/specledger/003-search-remote/contracts/index.md b/specledger/003-search-remote/contracts/index.md new file mode 100644 index 0000000..5572371 --- /dev/null +++ b/specledger/003-search-remote/contracts/index.md @@ -0,0 +1,42 @@ +# Contract: `skillrig index` — origin-side catalog generator (D2) + +Generates the origin's `index.json` from the skills' `SKILL.md` frontmatter. Runs **inside the origin repo** (locally or in its `index.yml` CI on merge to `main`). Origin-maintainer facing (spec US5). **Not** a consumer command and **not** one of the five cli.md consumer patterns — classify as an *origin-side generator* (propose a short cli.md note; FR-024). + +## Synopsis +``` +skillrig index [--out ] [--json] [--verbose] +``` + +## Flags +| Flag | Meaning | +|---|---| +| `--out ` | write the catalog to `` (default: `index.json` at repo root) | +| `--json` | machine summary of what was generated (counts, path) | +| `--verbose` | raw cause on error | + +## Behavior +1. Locate the origin repo root + `skills_dir` (from `.skillrig-origin.toml`; default `skills`). +2. Walk `/*/SKILL.md`; `ParseManifest` each (the **same** parser `add`/`verify`/`search` use — AP-04). +3. Project into catalog entries (`path` = dir relative to repo root); **sort by `name`** (determinism). +4. Marshal `index.json` with stable key order + trailing newline; carry `skillrigConvention` **read from the origin's `.skillrig-origin.toml` `convention_version`** (not a hardcoded constant — C7, so producer and the consumer's exact-match gate share one source of truth) + `origin`. +5. Write to `--out`. **Single-tip, full-regenerate** — overwrite wholesale; no aggregation, no GC (D2). + +## Output +- **Human:** `indexed N skills → ` + footer. Bounded. +- **`--json`:** `{ "out": "...", "skills": N, "convention": 1 }`. + +## Exit codes +| Code | When | +|---|---| +| 0 | catalog written (incl. 
no-change rewrite) | +| 1 | not in an origin repo / unreadable `skills_dir` / a malformed `SKILL.md` frontmatter | + +## Determinism & ground-truth (SC-009) +- Regenerating over an unchanged skill set is **byte-identical**. +- **Contract test (the oracle):** `skillrig index` over the PoC origin fixture MUST equal the committed `index.json` (producer == artifact). `TestQuickstart_IndexMatchesCommitted`. + +## Errors +Malformed frontmatter → what/why/fix naming the offending `SKILL.md`. Not-in-origin → fix (`run inside the origin repo`). `--verbose` → raw cause. + +## Tests +`TestQuickstart_IndexGenerates` (fields incl. topics present), `_IndexDeterministic` (twice → identical), `_IndexMatchesCommitted` (oracle), `_IndexMalformedFrontmatter`. diff --git a/specledger/003-search-remote/contracts/search.md b/specledger/003-search-remote/contracts/search.md new file mode 100644 index 0000000..a6592b9 --- /dev/null +++ b/specledger/003-search-remote/contracts/search.md @@ -0,0 +1,45 @@ +# Contract: `skillrig search` (Query pattern) + +Discover skills published by the resolved origin. Read-only. Reads the origin's `index.json` (D2/D7), gates `skillrigConvention` (D-convention), matches/filters **deterministically** (D8, N6). + +## Synopsis +``` +skillrig search [QUERY...] [--topic T ...] [--json] [--verbose] +``` + +## Flags +| Flag | Meaning | +|---|---| +| `[QUERY...]` | free-text query; case-insensitive **token-AND substring** over `name + description + topics` — a skill matches iff **every** whitespace-separated term is a substring of that concatenated text (D8) | +| `--topic T` | repeatable; structured filter — a skill must carry **all** requested topics (AND); exact-string, case-insensitive (A3). (Named "topic", not "tag" — git tags are version pins.) | +| `--json` | complete, untruncated machine output (every catalog field) | +| `--verbose` | raw underlying cause on error | + +## Behavior +1. Resolve origin via `config.ResolveOrigin` (env > project > global). 
No origin → usage error (FR exit 1). +2. Fetch `index.json` from the origin **per call** (no cache, D-catalog-fetch); for a remote origin this is a sparse `git` fetch at the resolved `@ref` (D7) with token auth (D4); for a local-path origin, read it from disk. +3. Gate `skillrigConvention` with **exact-match** (C1): `== 1` passes; **any other value — higher, lower, or absent/`0` — fails** with `IncompatibleConventionError` (FR-016), never partial results. +4. **Match** (in-memory, `pkg/skillcore`, AP-04): keep entries where every QUERY term is a substring of `name+description+topics` AND every `--topic` is present. Empty QUERY + no `--topic` ⇒ list all. +5. **Order** deterministically (D8, N6): fixed relevance bucket — exact-name `3` > name-hit `2` > topic-hit `1` > description-only `0` — then **lexicographic by `name`** (unique, total order). No fuzzy/semantic/learned ranking. +6. Render two-level output. + +## Output (Two-Level — cli.md P3) +- **Human (compact):** one line per matching skill (`name version — description` truncated) + a summary/footer hint (`N skills · run: skillrig add `). Line count ≤ matches + K (K≤5) — Constitution §II shape assertion. +- **`--json`:** `{ "origin": "...", "skills": [ {name,version,namespace,description,topics,path,requires} ] }` — complete, parseable, every field `add` needs (FR-003). +- **Empty result:** human → `no skills matched`; `--json` → `{"skills":[]}`. **Exit 0** (FR-004), data to stdout. + +## Exit codes +| Code | When | +|---|---| +| 0 | any well-formed query, including empty result | +| 1 | no origin configured; malformed origin; convention mismatch; auth/unreachable reaching the origin | + +## Errors (errors-as-navigation, all exit 1) +- no origin → what/why/fix (`skillrig init --origin …`). +- `IncompatibleConventionError`, `AuthError`, `UnreachableError` — distinct messages (data-model §5). `--verbose` shows raw cause. 
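
The match and order rules in Behavior steps 4 and 5 can be sketched with the standard library alone. This is an illustration under stated assumptions: `Entry` is a trimmed stand-in for the catalog entry, the sample skills other than `terraform-plan-review` are invented, and the way a multi-term query maps onto the four relevance buckets (first hit in name, then topics) is a guess, since the contract only fixes the bucket order.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Entry is a trimmed stand-in for the catalog entry (only what the matcher reads).
type Entry struct {
	Name, Description string
	Topics            []string
}

// bucket scores an entry: exact-name 3 > name-hit 2 > topic-hit 1 > description-only 0.
// ASSUMPTION: with several terms, any single term hitting the name or a topic counts.
func bucket(e Entry, terms []string) int {
	name := strings.ToLower(e.Name)
	if len(terms) > 0 && strings.Join(terms, " ") == name {
		return 3
	}
	for _, t := range terms {
		if strings.Contains(name, t) {
			return 2
		}
	}
	for _, t := range terms {
		for _, topic := range e.Topics {
			if strings.Contains(strings.ToLower(topic), t) {
				return 1
			}
		}
	}
	return 0
}

// hasAllTopics reports whether e carries every wanted topic (case-insensitive, exact).
func hasAllTopics(e Entry, want []string) bool {
	for _, w := range want {
		found := false
		for _, t := range e.Topics {
			if strings.EqualFold(t, w) {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

// Search keeps entries matching every query term and every topic, then orders them
// by descending bucket and ascending name (names are unique, so the order is total).
func Search(entries []Entry, query, topics []string) []Entry {
	terms := make([]string, len(query))
	for i, q := range query {
		terms[i] = strings.ToLower(q)
	}
	var out []Entry
	for _, e := range entries {
		hay := strings.ToLower(e.Name + " " + e.Description + " " + strings.Join(e.Topics, " "))
		ok := hasAllTopics(e, topics)
		for _, t := range terms {
			ok = ok && strings.Contains(hay, t)
		}
		if ok {
			out = append(out, e)
		}
	}
	sort.SliceStable(out, func(i, j int) bool {
		bi, bj := bucket(out[i], terms), bucket(out[j], terms)
		if bi != bj {
			return bi > bj
		}
		return out[i].Name < out[j].Name
	})
	return out
}

func main() {
	entries := []Entry{
		{"terraform-plan-review", "Review a terraform plan for risk and drift.", []string{"platform-team", "terraform", "aws"}},
		{"aws-cost-audit", "Audit cloud spend.", []string{"aws", "finops"}},       // invented sample
		{"drift-report", "Summarize terraform drift.", []string{"reporting"}}, // invented sample
	}
	for _, e := range Search(entries, []string{"terraform"}, nil) {
		fmt.Println(e.Name)
	}
}
```

An empty query with no topics falls through every filter, so all entries are returned in name order, matching the "list all" rule.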
+ +## Help (SC-008/FR-020) +Purpose line + ≥2 examples (`skillrig search terraform`, `skillrig search --topic aws`). Asserted by `TestQuickstart_SearchHelpExamples`. + +## Tests +`TestQuickstart_SearchQueryMatchesNameDesc`, `_SearchListsSkills`, `_SearchFilterByTopic`, `_SearchOrderingDeterministic`, `_SearchEmptyResult` (exit 0), `_SearchJSONComplete`, `_SearchConventionMismatch`, `_SearchHelpExamples`. diff --git a/specledger/003-search-remote/data-model.md b/specledger/003-search-remote/data-model.md new file mode 100644 index 0000000..05c8797 --- /dev/null +++ b/specledger/003-search-remote/data-model.md @@ -0,0 +1,170 @@ +# Phase 1 — Data Model: `003-search-remote` + +Entities, fields, validation, and the ground-truth samples each is anchored to (Constitution III). All types live in `pkg/skillcore` (presentation-free). Field-source decisions trace to research.md D1–D8. + +--- + +## 1. Skill manifest — `SKILL.md` frontmatter (was `skill.toml`) · D1 + +A skill is a directory containing `SKILL.md` (agent-facing instructions) whose **YAML frontmatter** carries the machine metadata. `skill.toml` is removed. + +**Standard agentskills.io fields** (verbatim): `name`, `description` (+ `license`, `compatibility`, `allowed-tools` passed through, not consumed by skillrig this slice). +**skillrig extensions** under the standard `metadata` map, namespaced `x-skillrig.*`: + +```yaml +--- +name: terraform-plan-review +description: Review a terraform plan for risk and drift. 
+license: MIT +metadata: + x-skillrig.namespace: my-org + x-skillrig.version: 1.4.0 + x-skillrig.convention-version: "1" + x-skillrig.topics: [platform-team, terraform, aws] + x-skillrig.requires: + - tool: oxid + version: ">=0.4.0" + source: my-org/my-skills + manager: mise + - tool: terraform + version: ">=1.6" + source: hashicorp/terraform + manager: mise +--- +# Terraform Plan Review + +``` + +**Go shape** (`manifest.go`, parsed with `gopkg.in/yaml.v3`): + +```go +type Manifest struct { + Name string // standard frontmatter `name` + Description string // standard frontmatter `description` + Namespace string // metadata."x-skillrig.namespace" + Version string // metadata."x-skillrig.version" + Convention string // metadata."x-skillrig.convention-version" + Topics []string // metadata."x-skillrig.topics" (renamed from tags — D8) + Requires []Require // metadata."x-skillrig.requires" +} +type Require struct{ Tool, Version, Source, Manager string } +``` + +- **Parsing:** read the file, split the `---`-delimited frontmatter block, `yaml.Unmarshal` into a struct with a `map[string]any` `metadata`, then lift `x-skillrig.*` keys. Unknown keys ignored (forward-compat). Mirrors `gh`'s `internal/skills/frontmatter`. +- **Validation:** `name` required and non-empty; `name` MUST equal the skill directory name (consistency, removes 002's duplication drift); `version` required for catalog entries; `topics` optional `[]string`. +- **Risk (D1):** `x-skillrig.requires` is a nested list — validate against `skills-ref validate`; fall back to a JSON-encoded string value only if a strict validator rejects the nested form. +- **Ground truth:** the migrated `/Users/vincentdesmet/specledger/skillrig/skillrig-origin/skills/terraform-plan-review/SKILL.md` (the real PoC skill, after the `skill.toml`→frontmatter fold). + +## 2. Library catalog — `index.json` · D2 + +The origin's committed, machine-readable list of skills, produced by `skillrig index` and consumed by `search`. 
**Discovery-only** (no per-skill `commit`/`treeSha`). **Single-tip** (one entry per skill = the HEAD version).
+
+```jsonc
+{
+  "skillrigConvention": 1, // contract/schema version — the binary gates on this (D-convention)
+  "origin": "my-org/my-skills",
+  "skills": [
+    {
+      "name": "terraform-plan-review",
+      "version": "1.4.0", // from metadata.x-skillrig.version
+      "namespace": "my-org",
+      "description": "Review a terraform plan for risk and drift.",
+      "topics": ["platform-team","terraform","aws"], // search --topic filters on these
+      "path": "skills/terraform-plan-review", // from the directory
+      "requires": [ {"tool":"oxid","version":">=0.4.0","source":"my-org/my-skills"} ]
+    }
+  ]
+}
+```
+
+**Go shape** (`catalog.go`):
+
+```go
+type Catalog struct {
+	SkillrigConvention int            `json:"skillrigConvention"`
+	Origin             string         `json:"origin"`
+	Skills             []CatalogEntry `json:"skills"`
+}
+
+// JSON tags keep the emitted keys lowercase, matching the sample above;
+// untagged fields would marshal as "Name", "Version", and so on.
+type CatalogEntry struct {
+	Name        string    `json:"name"`
+	Version     string    `json:"version"`
+	Namespace   string    `json:"namespace"`
+	Description string    `json:"description"`
+	Topics      []string  `json:"topics"`
+	Path        string    `json:"path"`
+	Requires    []Require `json:"requires"`
+}
+```
+
+- **Generation (`index`):** walk `/*/SKILL.md`, `ParseManifest` each, project into `CatalogEntry` (`path` = dir relative to repo root), sort by `name` (determinism — SC-009), marshal with stable key order + trailing newline.
+- **Consumption (`search`):** parse; **gate `skillrigConvention`** with an **exact-match policy** (decided 2026-05-31, C1) — the binary supports **exactly** convention `1`; **any other value, including a lower version (`0`) or an absent/zero field, fails** with `IncompatibleConventionError` (FR-016), never partial results. (Exact-match is the YAGNI choice for v0; a forward/backward-compat window is a deliberate future change, not an accident of a `>`-only check.)
+- **Validation:** `skills` sorted by `name`, unique names; every entry has `name`/`version`/`path`.
+- **Determinism contract (SC-009):** `skillrig index` over a fixed skill set is byte-identical across runs; and `index`(origin fixture) == the committed `index.json` (ground-truth oracle). +- **Ground truth:** the committed `/Users/vincentdesmet/specledger/skillrig/skillrig-origin/index.json` (regenerated by the new `index` from the migrated frontmatter). + +## 3. Lock entry — `.skillrig/skills-lock.json` · D5 (NO schema change) + +The existing `LockEntry` **already** carries `Version` — so recording the human-readable version/tag needs **no new field**; remote/pinned `add` simply populates `Version` with the **resolved tag/version**, `Commit` with the fetched commit, `TreeSha` with the computed subtree SHA. + +```go +type LockEntry struct { + Version string `json:"version"` // resolved human-readable version/tag (D5) + Commit string `json:"commit"` // provenance — exact fetched commit + TreeSha string `json:"treeSha"` // label-honesty — computed from the fetched subtree + Path string `json:"path"` // .agents/skills/ +} +``` + +- **Local-path add (002):** `Version` ← manifest `version` (unchanged). +- **Remote add:** `Version` ← the resolved tag for a `--pin` (e.g. `v1.4.0`), else the manifest `version` at the fetched ref; `Commit` ← `git rev-parse` of the fetched ref; `TreeSha` ← `skillcore.TreeSHA` of the vendored subtree (same code `verify` recomputes — AP-04). +- **`--pin` resolution (C3, deterministic — single rule):** a bare semver (`^v?SEMVER$`) is **always** expanded via `tag_scheme` (`name-vSEMVER`) to `-v`; **any other value** is a literal git ref (full tag or commit SHA) passed through. So `--pin v1.4.0` and `--pin terraform-plan-review-v1.4.0` resolve to the *same* commit → identical `commit`/`treeSha` (SC-004); a 40-hex SHA is taken literally. A `--pin` that resolves to no existing ref → `NoSuchVersionError` (§5), distinct from a missing skill. 
+- **Validation / `verify` (002, unchanged):** on-disk subtree tree-SHA must equal `TreeSha`; on-disk skill set must equal the locked set (orphan check).
+- **Ground truth:** a real `.skillrig/skills-lock.json` written by remote `add` against the `file://` bare-repo fixture, with `treeSha` cross-checked against raw `git ls-tree`.
+
+## 4. Origin reference & form · D3
+
+`config.Origin{Owner, Repo, Ref}` (001, unchanged grammar). New: a **classification** of the resolved origin into its form, decided in `internal/cli` (presentation/config layer) and passed to `skillcore`:
+
+| Form | Detected when | skillcore behavior |
+|---|---|---|
+| **Local path** | origin resolves to an explicit filesystem path (the configured value is a path, not `OWNER/REPO`) | operate on the local checkout (002 path, generalized to the real path) |
+| **Remote** | a bare `OWNER/REPO[@REF]` | `git clone --sparse` from `https://github.com/OWNER/REPO` at `REF` (D4/D7) |
+
+- No tool-managed cache; no "both present" precedence (D3). The chosen form is reported to the user (FR-011).
+- `@REF` = origin-level branch pointer (001); `--pin` = per-skill immutable tag/SHA (D5) — orthogonal.
+
+## 5. Typed errors · D4/D6 (extend `pkg/skillcore/errors.go`)
+
+Five new typed errors, rendered with what/why/fix by `internal/cli` (errors-as-navigation). Four are **classified inside the fetch layer** (never in `cli`): the three network classes (`AuthError`/`UnreachableError`/`NotFoundError`) are classified from `GitError.Stderr`, and `NoSuchVersionError` is raised from a failed `--pin` ref-resolution (the ref resolves to no tag/commit). `NoSuchVersionError` is a **distinct type**, so `cli`/CI can branch on it rather than on prose (C2). The fifth, `IncompatibleConventionError`, is raised by the `skillrigConvention` exact-match gate (§2). All map to **exit 1** (usage/config class) this slice; exit 2/3 stay reserved.
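
A minimal sketch of the stderr classification for the three network classes, using the trigger substrings listed in this section's error table. The struct fields, the `classifyGitError` name, and the `Cause`/`Unwrap` wiring are assumptions made for illustration; only the type names and trigger strings come from this document.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// GitError carries the raw stderr of a failed git invocation (fields assumed).
type GitError struct{ Stderr string }

func (e *GitError) Error() string { return "git: " + strings.TrimSpace(e.Stderr) }

// The typed errors wrap the raw cause so --verbose can still surface it.
type AuthError struct{ Cause *GitError }
type UnreachableError struct{ Cause *GitError }
type NotFoundError struct{ Cause *GitError }

func (e *AuthError) Error() string        { return "authentication failed" }
func (e *AuthError) Unwrap() error        { return e.Cause }
func (e *UnreachableError) Error() string { return "origin unreachable" }
func (e *UnreachableError) Unwrap() error { return e.Cause }
func (e *NotFoundError) Error() string    { return "not found" }
func (e *NotFoundError) Unwrap() error    { return e.Cause }

// containsAny reports whether s contains at least one of the given substrings.
func containsAny(s string, subs ...string) bool {
	for _, sub := range subs {
		if strings.Contains(s, sub) {
			return true
		}
	}
	return false
}

// classifyGitError (hypothetical) maps stderr text to a typed error.
// Unrecognized stderr is returned as the raw GitError.
func classifyGitError(ge *GitError) error {
	s := ge.Stderr
	switch {
	case containsAny(s, "Authentication failed", "Invalid username or token"):
		return &AuthError{Cause: ge}
	case containsAny(s, "Could not resolve host", "Failed to connect"):
		return &UnreachableError{Cause: ge}
	case strings.Contains(s, "repository") && strings.Contains(s, "not found"):
		return &NotFoundError{Cause: ge}
	default:
		return ge
	}
}

func main() {
	err := classifyGitError(&GitError{
		Stderr: "fatal: unable to access 'https://github.com/o/r/': Could not resolve host: github.com",
	})
	var unreachable *UnreachableError
	fmt.Println(errors.As(err, &unreachable))
}
```

Callers then branch with `errors.As` on the type rather than on message text, and the wrapped `GitError` stays available for the `--verbose` rendering.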
+ +| Error | Trigger (git/gh stderr, exit 128) | cli rendering (what/why/fix) | +|---|---|---| +| `AuthError` | `Authentication failed` / `Invalid username or token` | "authentication failed reaching " / private origin + no valid token / `gh auth login` or set `GITHUB_TOKEN` | +| `UnreachableError` | `Could not resolve host` / `Failed to connect` | "could not reach " / network/location / check connectivity & origin spelling | +| `NotFoundError` | `repository '…' not found` (origin) or missing skill subtree | " not found in " / no such skill or private+unauthenticated / run `skillrig search`; **if private, authenticate** (the D4 subtlety: GitHub returns *not found* for private+no-token) | +| `NoSuchVersionError` (C2) | a `--pin` ref that resolves to no existing tag/commit (distinct from a missing skill — the skill exists, the requested version does not) | " has no version " / the pin does not match a released tag or commit / run `skillrig search` for the current version, or pin an existing tag | +| `IncompatibleConventionError` | `skillrigConvention != 1` (exact-match; includes a higher version, a lower version, and an absent/`0` field) | "origin uses convention vN (this tool supports exactly v1)" / tool/origin convention mismatch / update skillrig, or check the origin's `.skillrig-origin.toml` | + +Carried forward from 002 (unchanged): `OriginNotFoundError` (local-path form absent), `InvalidSkillNameError` (path-traversal guard), `SymlinkUnsupportedError`, `LockError`, `GitError` (raw, surfaced under `--verbose`). + +## 5b. Search matcher (in-memory, `pkg/skillcore`) · D8 + +Not persisted state — a pure function over the fetched `Catalog`. `Search(catalog, query []string, topics []string) []CatalogEntry`: +- **Match:** keep entries where every `query` term (lowercased) is a substring of `lower(name+" "+description+" "+join(topics))` AND every requested `topic` is present (case-insensitive exact membership). 
+- **Order:** stable sort by descending relevance bucket {exact-name 3, name-substring 2, topic 1, description-only 0} then ascending `name` (unique → total order). Deterministic; **no** fuzzy/semantic/learned ranking (N6). +- **Dependency:** stdlib only (`strings`, `slices`/`sort`). Scale: linear over tens–hundreds of entries (S5: index structures are YAGNI below ~10k docs). +- **Ground truth:** table-driven unit tests in `pkg/skillcore` (query→ordered names); an integration determinism assert (two runs byte-identical). + +## 6. Token (transient, never persisted) · D4 + +Not stored anywhere. `ResolveGitHubToken(hostname string) (string, bool)` returns the first of: `GH_TOKEN` env → `GITHUB_TOKEN` env → `os.exec("gh","auth","token","--hostname",hostname)` (exit 0 + non-empty). Injected per-fetch via `git -c http.extraHeader="Authorization: Basic "`. `hostname` param exists so GHE is a one-line future extension (deferred). + +## Entity relationships + +``` +Origin (local path | remote OWNER/REPO@ref) + └─ index.json (Catalog) ──generated by──> skillrig index <──ParseManifest── SKILL.md frontmatter (Manifest) + │ read by │ fetched + hashed by + ▼ ▼ + search (query + --topic filter) add (remote fetch → vendor → LockEntry{version,commit,treeSha,path}) + │ checked by + ▼ + verify (002, offline) +``` diff --git a/specledger/003-search-remote/plan.md b/specledger/003-search-remote/plan.md new file mode 100644 index 0000000..71c4d61 --- /dev/null +++ b/specledger/003-search-remote/plan.md @@ -0,0 +1,104 @@ +# Implementation Plan: Discover & Acquire Skills (`search` + remote `add` + `index`) + +**Branch**: `003-search-remote` | **Date**: 2026-05-31 | **Spec**: [spec.md](./spec.md) +**Input**: [spec.md](./spec.md) + [spec-tech.md](./spec-tech.md) + research spikes [S1](./research/2026-05-31-skill-manifest-format.md) · [S2](./research/2026-05-31-catalog-generation-lifecycle.md) · [S3](./research/2026-05-31-auth-token-resolution.md) · 
[S4](./research/2026-05-31-remote-git-testing.md) · [S5](./research/2026-05-31-search-index-architecture.md) + +## Summary + +Deliver the first end-to-end consumer loop — **`init` → `search` → `add` → `verify`** — plus the origin-side **`index`** generator that keeps discovery honest. A user binds a repo to a remote GitHub origin, **finds** a skill (`search` reads the origin's catalog — query-first over name+description, deterministic `--topic` filter), and **vendors** it directly from the remote (`add` fetches the subtree, no local checkout), recording `commit` + `treeSha` + resolved `version/tag` in the lock so `verify` (002) still passes. The catalog `search` reads is produced by `skillrig index` from each skill's **`SKILL.md` frontmatter** — which this slice migrates to (dropping `skill.toml`, the first build step), aligning with the agentskills.io standard. + +All five design uncertainties were resolved in spikes **before**/during planning (S1 manifest format, S2 catalog lifecycle, S3 auth, S4 testing, S5 search/index architecture); this plan consumes their conclusions and does not re-open them. + +## Technical Context + +**Language/Version**: Go 1.24+ (toolchain 1.24.4) — single static binary (unchanged). +**Primary Dependencies**: `github.com/spf13/cobra` (commands); `github.com/pelletier/go-toml/v2` (config + retained for `.skillrig/config.toml`); **NEW: `gopkg.in/yaml.v3`** (SKILL.md frontmatter — accepted 2026-05-31, the parser `gh` uses; see Complexity Tracking). Lock uses stdlib `encoding/json`. Fetch + tree-SHA via **shelling `git`** (no in-process git/hashing lib). Token via `os.exec` of `git`/`gh` (no `gh`-as-library). +**Storage**: local files only — vendored subtree `.agents/skills//`, committed lock `.skillrig/skills-lock.json`; origin-side `index.json` (committed in the origin). No DB. **No tool-managed cache** (catalog fetched per `search`). 
+**Testing**: `go test`, two tiers — (a) presentation-free **unit** in `internal/...` + `pkg/skillcore` (table-driven + ground-truth: fetched tree-SHA == raw `git`; `index` output == committed `index.json`); (b) **`TestQuickstart_*` integration** in `test/` building/exec'ing the real binary. **New network boundary** tested via S4's substrate: `file://` + local bare repo for happy/integrity; the existing `pkg/skillcore/git.go` `commandContext` exec-stub seam (extended to `Clone`/`FetchSparse`) for auth/unreachable/transient. **No `httptest`/go-vcr** (skillrig shells `git`, never calls the GitHub HTTP API — see Constitution Check). +**Target Platform**: developer/CI machines with `git` (and optionally `gh`) on PATH (macOS/Linux; Windows later). +**Project Type**: single project (existing two-layer Go CLI). +**Performance Goals**: interactive CLI; `search`/`add` dominated by one `git` fetch — no throughput target. Determinism is the hard requirement (SC-002/004/009), not latency. +**Constraints**: offline-deterministic test gate; errors-as-navigation; two-level output; exit codes `search`/`index` 0/1, `add` 0(incl. no-op)/1; exit 2/3 reserved (not emitted). Single origin resolver (`config.ResolveOrigin`); single `skillcore` (one fetch + one `ParseManifest`, shared by `index`/`add`/`verify`/`search` — AP-04/06). +**Scale/Scope**: one origin, tens–hundreds of skills per catalog; vendored subtrees small. 3 new/changed commands (`search`, `index`, remote `add`) + manifest migration + co-evolution docs. + +## Constitution Check + +*GATE: re-checked after Phase 1 (see end).* + +- [x] **I. Specification-First**: spec.md complete, 5 prioritized user stories, clarified + spiked. +- [x] **II. Quickstart-as-Contract**: every US (US1–US5) → a `TestQuickstart_*` scenario with **output-shape** assertions (bounded human lines; parseable+complete `--json`; 3-part errors + exit code). See quickstart.md. +- [x] **III. 
Ground-Truth Anchoring**: fixtures derived from real output — fetched tree-SHA == raw `git ls-tree`/`rev-parse`; `skillrig index` output == the committed origin `index.json`; manifest fixtures = real `SKILL.md` frontmatter. **Divergence (justified):** §III's "httptest + go-vcr for the GitHub path" does **not** apply — skillrig fetches via shell `git`, so the integrity boundary is the `git` exec, not an HTTP call; S4's exec-stub seam is the faithful mock. Also §III lists `skill.toml` as the index source; S1 changes it to `SKILL.md` frontmatter. Both are **constitution-doc touch-ups** flagged for the FR-024 sweep (amendment needs team approval; not changed unilaterally here). +- [x] **IV. Agent-First CLI Design**: `search` = Query, remote `add` = Vendor Mutation, `index` = origin-side generator; all navigable from `--help` (≥2 examples); errors-as-navigation; two-level output; consume-only (no write credential — token is read-only fetch auth). cli.md updated in-branch (FR-024). +- [x] **V. Code Quality (Go)**: `gofmt`/`go vet`/golangci-lint gate; presentation/execution split preserved (typed errors in `skillcore`, prose in `cli`). +- [x] **VI–VIII. YAGNI / Shortest-path / Simplicity**: catalog is single-tip full-regenerate (no aggregation/GC — S2); `index` is a thin walk+marshal reusing `ParseManifest`; no `httptest`; GHE deferred; `--pin` minimal. +- [x] **IX. Skill–CLI Co-Evolution**: extend the single consolidated `skillrig` skill — `references/search.md` + `references/index.md` (new), update `references/add.md` (remote + `--pin` + auth/unreachable errors); root routing + keywords; sonnet trigger evals. +- [ ] **Issue Tracking**: epic + per-US features to be created (`sl issue create --type epic`) at `/specledger.tasks` time (002 skipped the ledger; 003 restores it). + +**Complexity Violations**: one — a new dependency (`gopkg.in/yaml.v3`) against the "no new dependencies" note. Justified in Complexity Tracking. 
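
The exec-stub seam that principle III and the Testing note rely on can be pictured as follows. This is a sketch only: the `commandContext` name comes from this plan, but its signature, the `runGit` wrapper, and `stubGitFailure` are assumptions, and the stub shells out to `sh`, so it runs on POSIX systems only.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"os/exec"
	"strconv"
)

// commandContext is the seam: production code calls it instead of
// exec.CommandContext directly, so tests can swap it (signature assumed).
var commandContext = exec.CommandContext

// runGit (hypothetical) runs git through the seam and returns stderr and the exit code.
func runGit(ctx context.Context, args ...string) (string, int, error) {
	cmd := commandContext(ctx, "git", args...)
	var stderr bytes.Buffer
	cmd.Stderr = &stderr
	err := cmd.Run()
	code := -1 // stays -1 if the process never started
	if cmd.ProcessState != nil {
		code = cmd.ProcessState.ExitCode()
	}
	return stderr.String(), code, err
}

// stubGitFailure replaces the seam with a fake "git" that prints msg to stderr
// and exits with code. It returns a func that restores the real seam.
func stubGitFailure(msg string, code int) func() {
	orig := commandContext
	commandContext = func(ctx context.Context, _ string, _ ...string) *exec.Cmd {
		// $1 and $2 are passed as arguments, so msg is never parsed as shell code.
		script := `printf '%s\n' "$1" >&2; exit "$2"`
		return exec.CommandContext(ctx, "sh", "-c", script, "sh", msg, strconv.Itoa(code))
	}
	return func() { commandContext = orig }
}

// demo simulates an authentication failure without touching the network.
func demo() (string, int) {
	restore := stubGitFailure("fatal: Authentication failed", 128)
	defer restore()
	stderr, code, _ := runGit(context.Background(), "clone", "https://example.invalid/o/r")
	return stderr, code
}

func main() {
	stderr, code := demo()
	fmt.Printf("exit=%d stderr=%q\n", code, stderr)
}
```

Since the code under test only ever sees the stderr text and exit status of a child process, a stub of this kind exercises the same boundary a real `git` failure would, which is the argument for not using HTTP-level mocks here.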
+ +## Project Structure + +### Documentation (this feature) + +```text +specledger/003-search-remote/ +├── plan.md # this file +├── spec.md spec-tech.md +├── research/2026-05-31-*.md # S1–S5 spikes (done) +├── research.md # Phase 0 — consolidates the spikes + prior work +├── data-model.md # Phase 1 — manifest, catalog, lock, typed errors +├── quickstart.md # Phase 1 — US1–US5 → TestQuickstart_* +├── contracts/ # Phase 1 — search.md, add.md, index.md, schemas +└── tasks.md # /specledger.tasks (not this command) +``` + +### Source Code (repository root) + +```text +main.go # unchanged shim +internal/cli/ # PRESENTATION + cobra wiring only +├── root.go # + register search, index (add already registered) +├── add.go # extend: remote path, --pin, map Auth/Unreachable/NotFound +├── search.go (new) # Query: render two-level list + --json +├── index.go (new) # origin-side generate; render summary +├── output.go exit.go repo.go # reuse/extend renderers + exit mapping +internal/config/ # resolver (unchanged); origin form classification helper +pkg/skillcore/ # business logic, presentation-FREE — the single core +├── manifest.go # REWRITE: SKILL.md frontmatter (yaml.v3) + metadata.x-skillrig.* +├── fetch.go (new) # git clone --sparse over origin; token via os.exec; typed errors +├── catalog.go (new) # parse index.json (search) + generate it from frontmatter (index) +├── add.go # branch: local-path origin (002) vs remote fetch; lock w/ version/tag +├── errors.go # + AuthError, UnreachableError, NotFoundError (from GitError.Stderr) +├── git.go # + Clone/FetchSparse on the existing commandContext stub seam +├── lock.go treesha.go verify.go # lock entry gains resolved version/tag; verify unchanged logic +test/ # TestQuickstart_* (build+exec real binary) — S4 substrate +``` + +**Structure Decision**: keep the established two-layer split. 
**All** remote-fetch, token-resolution, catalog parse/generate, and manifest parsing land in `pkg/skillcore` (one implementation, AP-04); `internal/cli` only wires cobra + renders. New commands `search`/`index` mirror `add`/`verify` wiring. + +## Build sequence (independently testable slices) + +1. **Manifest migration (commit 1)** — rewrite `ParseManifest` to read `SKILL.md` frontmatter via `yaml.v3` (standard fields + `metadata.x-skillrig.*`); drop `skill.toml`; migrate the origin-template skill + fixtures; remove the name/description duplication. Existing `verify`/`add` tests stay green. *(S1)* +2. **`skillrig index` + contract test** — `catalog.go` generate: walk `skills/*/SKILL.md`, `ParseManifest`, marshal `index.json`; `index.go` CLI. Ground-truth: `index` over the origin fixture == committed `index.json`. *(S2)* +3. **Remote fetch layer** — `fetch.go`: `git clone --sparse`/sparse-checkout at ref; `ResolveGitHubToken(hostname)` via `os.exec` (GH_TOKEN→GITHUB_TOKEN→`gh auth token`); classify `GitError.Stderr`→`AuthError`/`NotFoundError`/`UnreachableError`; inject token via `git -c http.extraHeader`. Unit-tested on the exec-stub seam. *(S3, S4)* +4. **Remote `add`** — branch origin classification (local path vs remote `OWNER/REPO`); remote: fetch→byte-identical vendor (reuse 002 copy/treeSha)→lock with `commit`+`treeSha`+resolved `version/tag`; `--pin`; idempotent no-op; force-on-divergence; map new errors in cli. *(S3/S4 + 002 reuse)* +5. **`search`** — `catalog.go` parse; convention-version gate; deterministic query matcher (`Search(catalog, query, topics)` in `skillcore` — the signature authored in data-model §5b, C10; **stdlib-only** — case-insensitive token-AND substring over `name`+`description`+`topics`, ordered by fixed-bucket score then name) + exact-string AND `--topic` filter; two-level output. *(S2, S5)* +6. 
**Seed the origin with more skills (so `search` has real data to discover/filter)** — vendor several public skills into `skillrig-origin/skills/` and enrich each with `x-skillrig.*` frontmatter so the catalog carries `topics`/`version`. **Tool: `npx skills add` (NOT `sl skill add`).** Mechanics verified live (S5 addendum): + - `sl skill add` **cannot** be used here — it errors `not in a SpecLedger project` (the origin repo is not a SpecLedger project) and installs to `.claude/skills/`, not the canonical `skills/` dir. + - From `../../skillrig-origin/`, per skill run: `npx skills add --skill --copy -y`. `--copy` lands a real `skills//SKILL.md` (+ the skill's `references/`) in the origin's `skills_dir`. Candidates from the registry, e.g. `hashicorp/agent-skills@terraform-test`, `hashicorp/agent-skills@terraform-stacks`, `vercel-labs/agent-skills@creating-pr`. + - **Cleanup (required):** `npx skills add` also fans out ~25 agent-specific copies (`.claude/`, `.windsurf/`, `.cursor/`, …) — for an origin repo, commit **only** `skills//` and delete/gitignore the dot-dir copies. + - **Enrichment (required):** vendored `SKILL.md` carries standard agentskills.io frontmatter only (`name`/`description`, occasionally a `metadata.version`) — **no `metadata.x-skillrig.*`**. Add the skillrig block per skill: `x-skillrig.namespace`, `x-skillrig.version`, `x-skillrig.convention-version: "1"`, `x-skillrig.topics: [...]`, and `x-skillrig.requires` if it has backing CLIs. Without this, `skillrig index` cannot emit `topics`/`version` and `search --topic` has nothing to filter. **Enrichment is a checked precondition (C9):** `index` MUST fail clearly on a skill missing the required `x-skillrig.version` (covered by `TestQuickstart_IndexMissingVersion`), so the `IndexMatchesCommitted` oracle can't silently pass over an under-enriched seed. 
+ - Then regenerate via `skillrig index --out index.json` (step 2) and re-assert the contract test over the now-multi-skill fixture — giving `search`/`--topic` real multi-entry coverage in `TestQuickstart_*`. +7. **Co-evolution (FR-023/024)** — origin-template (frontmatter + `index.yml` calls `skillrig index`); `docs/ROADMAP.md`, `docs/ARCHITECTURE-v0.md` (003+004 merge, local-vs-remote reframe, frontmatter+`yaml.v3`, "on merge" not "on release", mise precedence fix), `docs/design/cli.md` (search/index/remote-add surface), and the **constitution touch-ups (C14, one pass, team-approved)** — §III:78 (`skill.toml`→`SKILL.md` as the index source), §III:82-83 (httptest/go-vcr → the exec-stub seam for the shell-`git` boundary), and the stale §IX `scripts/run_eval.py` path (actual: `.agents/skills/skill-creator/scripts/run_eval.py`); extend the `skillrig` skill + sonnet evals. + +## Complexity Tracking + +| Violation | Why Needed | Simpler Alternative Rejected Because | +|-----------|------------|--------------------------------------| +| New dep `gopkg.in/yaml.v3` (vs "no new dependencies") | SKILL.md frontmatter *is* YAML; adopting the agentskills.io standard (S1) requires a YAML parser. It is the same parser `gh` uses, and it replaces `skill.toml`'s bespoke sibling-file format. | Hand-rolling a YAML subset parser is more code + more risk for a worse result; staying on `skill.toml` keeps skillrig diverged from the 26+ agentskills.io-compliant clients (portability loss) and keeps the name/description duplication drift bug. User accepted the dep 2026-05-31. | +| `index` command (origin-side generator) in a "consume-only" CLI | `search` is meaningless against a hand-maintained catalog that drifts (the shipped `build-index.sh` already drops `tags`); skillrig is the single tool for origin maintenance too (spec US5). The generator is thin (walk + shared `ParseManifest` + marshal) and reuses the consumer parser (AP-04). 
| Deferring to a sibling feature ships a consumer (`search`) against a known-broken producer — the false economy S2 flagged. It is not a write-credential path (no auth, local FS only), so it doesn't breach "consume-only" in the credential sense. | + +## Notes for `/specledger.tasks` +- Create the epic + 5 user-story features + the migration/co-evolution features in `sl issue` (restore the ledger 002 skipped). +- Each US → `TestQuickstart_*` with §II output-shape asserts; include the two ground-truth tests. +- Add a cli.md pattern-gate checklist task per new command (Query for `search`; Vendor Mutation for remote `add`; classify `index` as origin-side generator — propose a cli.md note since it's not one of the five consumer patterns). diff --git a/specledger/003-search-remote/quickstart.md b/specledger/003-search-remote/quickstart.md new file mode 100644 index 0000000..8b01594 --- /dev/null +++ b/specledger/003-search-remote/quickstart.md @@ -0,0 +1,71 @@ +# Quickstart — Acceptance Contract: `003-search-remote` + +Each scenario is an executable `TestQuickstart_*` (Constitution §II): concrete invocations, observable output, exit codes, and **output-shape** assertions (bounded human lines; parseable+complete `--json`; 3-part errors). Every user story (US1–US5) maps here. Tests build and exec the real binary against the S4 substrate. + +## Test substrate (S4 / D6) +- **Origin fixture** bootstrapped in `t.TempDir()`: a working tree with `index.json` + `skills/terraform-plan-review/SKILL.md` (frontmatter), committed; pushed to a local **bare** repo. The CLI's origin is `file://` for the remote-fetch path. +- **Failure injection:** the existing `pkg/skillcore/git.go` `commandContext` exec-stub seam (extended to `Clone`/`FetchSparse`) returns crafted `(exit=128, stderr=…)` for auth/unreachable/transient — `pkg/skillcore` unit tests, not integration. 
+- **Ground-truth oracles:** `fetched treeSha == rawTreeSHA(fixture,"HEAD","skills/")`; `skillrig index`(fixture) == committed `index.json`. + +--- + +## US1 — Discover (search) · P1 + +**`TestQuickstart_SearchListsSkills`** — Given an origin publishing ≥2 skills, `skillrig search` (no query) lists each (`name`, `version`, one-line desc) + footer hint; assert `len(lines) ≤ matches + 5`. +**`TestQuickstart_SearchQueryMatchesNameDesc`** — `skillrig search terraform plan` returns only skills whose name+description+topics contain **both** terms (token-AND substring); a skill matching one term but not the other is excluded (FR-002). +**`TestQuickstart_SearchOrderingDeterministic`** — for a query hitting several skills, results are ordered by the fixed relevance bucket then name, and are **byte-identical across two runs** (D8/N6, SC-002). +**`TestQuickstart_SearchFilterByTopic`** — `skillrig search --topic aws` lists only aws-topic skills; identical across two runs. +**`TestQuickstart_SearchEmptyResult`** — `skillrig search --topic nonesuch` → `no skills matched`, **exit 0**. +**`TestQuickstart_SearchJSONComplete`** — `--json` parses (`json.Unmarshal` ok) and every entry has name/version/namespace/description/topics/path (field-presence, not truncation). +**`TestQuickstart_SearchConventionMismatch`** — origin catalog `skillrigConvention: 2` → exit 1, message names a compatibility mismatch + "update skillrig" (3 parts). +**`TestQuickstart_SearchConventionBoundary`** (C1) — exact-match gate: `skillrigConvention: 0` **and** an absent field each → exit 1 `IncompatibleConventionError` (a lower/missing convention does **not** silently pass), while `1` passes — pinning the non-`>` boundary so FR-016/SC-005 is unambiguous. +**`TestQuickstart_SearchHelpExamples`** — `search --help` shows purpose + ≥2 examples. 
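The US1 matching and ordering rules can be sketched in stdlib-only Go. This is a minimal illustration, not the authored `Search(catalog, query, topics)` signature: `Entry`, `bucket`, and `searchSketch` are hypothetical names, the `--topic` filter is omitted, and the way a multi-term query maps onto the 3/2/1/0 relevance buckets from research D8 is an assumption. It uses `sort.Slice` rather than `slices.SortFunc` only to stay buildable on older toolchains.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Entry is a hypothetical catalog entry; only the searchable fields are shown.
type Entry struct {
	Name        string
	Description string
	Topics      []string
}

// bucket returns the fixed relevance bucket for an entry that already matched.
// Assumption: 3 = the whole query equals the name, 2 = some term hits the name,
// 1 = some term hits a topic, 0 = description-only.
func bucket(e Entry, terms []string) int {
	name := strings.ToLower(e.Name)
	if strings.Join(terms, " ") == name {
		return 3
	}
	for _, t := range terms {
		if strings.Contains(name, t) {
			return 2
		}
	}
	topics := strings.ToLower(strings.Join(e.Topics, " "))
	for _, t := range terms {
		if strings.Contains(topics, t) {
			return 1
		}
	}
	return 0
}

// searchSketch keeps entries whose name+description+topics contain every
// query term (case-insensitive substring), then orders by bucket descending
// and name ascending. An empty query matches everything and sorts by name.
func searchSketch(catalog []Entry, query string) []Entry {
	terms := strings.Fields(strings.ToLower(query))
	var out []Entry
	for _, e := range catalog {
		hay := strings.ToLower(e.Name + " " + e.Description + " " + strings.Join(e.Topics, " "))
		matched := true
		for _, t := range terms {
			if !strings.Contains(hay, t) {
				matched = false
				break
			}
		}
		if matched {
			out = append(out, e)
		}
	}
	sort.Slice(out, func(i, j int) bool {
		bi, bj := bucket(out[i], terms), bucket(out[j], terms)
		if bi != bj {
			return bi > bj
		}
		return out[i].Name < out[j].Name
	})
	return out
}

func main() {
	catalog := []Entry{
		{Name: "terraform-test", Description: "Write tests for Terraform modules", Topics: []string{"terraform"}},
		{Name: "terraform-plan-review", Description: "Review a terraform plan", Topics: []string{"terraform", "aws"}},
		{Name: "creating-pr", Description: "Open a pull request", Topics: []string{"git"}},
	}
	for _, e := range searchSketch(catalog, "terraform plan") {
		fmt.Println(e.Name)
	}
}
```

Because names are unique per catalog, the (bucket, name) comparison is a total order, which is what makes byte-identical output across runs achievable without a stable sort.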
+ +## US2 — Acquire remotely (add) · P1 + +**`TestQuickstart_AddRemoteNoLocalCopy`** — Given a `file://` origin and **no** local checkout, `skillrig add terraform-plan-review` vendors the subtree into `.agents/skills/…` byte-identical to the fixture, writes a lock entry (`version`/`commit`/`treeSha`/`path`); then **`skillrig verify` exits 0**. Ground-truth: lock `treeSha` == raw `git ls-tree`. +**`TestQuickstart_AddRemoteIdempotent`** — re-running `add` on the unchanged vendored skill → `unchanged`, **exit 0**, lock byte-unchanged, no FS change (SC-006). +**`TestQuickstart_AddRemoteForceOnDivergence`** — locally modify the vendored skill, re-`add` → refused with a `--force` hint (002 parity); `--force` overwrites. +**`TestQuickstart_AddDryRun`** (C6) — `add … --dry-run` prints a bounded preview, **exit 0**, and leaves the working tree + lock byte-unchanged (`git status --porcelain` empty, lock unchanged) — FR-020 dry-run for the remote path. +**`TestQuickstart_AddHelpExamples`** (C5) — `add --help` shows the purpose line + **≥2 runnable examples**, one of which is the `--pin` form (SC-008 for the second consumer command; bounded shape). + +## US3 — Reproducible pin · P2 + +**`TestQuickstart_AddPinnedReproducible`** — `add … --pin v1.4.0` on two clean repos → byte-identical content + identical lock (`version=1.4.0`, same `commit`/`treeSha`) (SC-004). +**`TestQuickstart_AddPinTagFormEquivalent`** (C3) — `add … --pin v1.4.0` and `add … --pin terraform-plan-review-v1.4.0` resolve to the **same** `commit`/`treeSha` (bare-semver expansion == full-tag literal), confirming the deterministic `--pin` resolution rule. +**`TestQuickstart_AddPinNotFound`** — `--pin v9.9.9` → exit 1; assert the error is a **`NoSuchVersionError`** (typed/structured discriminator, not a substring) — distinct from skill-not-found (FR-015, C2). 
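The `--pin` equivalence asserted in US3 (bare semver and full-tag literal resolving to the same ref) could look roughly like the following. This is a sketch under stated assumptions: `resolvePinTag` and `bareSemver` are hypothetical names, only plain `vMAJOR.MINOR.PATCH` is treated as bare semver (no pre-release or build suffixes), and any other value, such as a commit SHA, is assumed to pass through unchanged.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// bareSemver matches a bare "vMAJOR.MINOR.PATCH" pin such as "v1.4.0".
var bareSemver = regexp.MustCompile(`^v\d+\.\d+\.\d+$`)

// resolvePinTag is a hypothetical helper that maps a --pin value to the git
// ref to fetch under the "name-vSEMVER" tag scheme.
func resolvePinTag(skill, pin string) string {
	if strings.HasPrefix(pin, skill+"-v") {
		return pin // already the full-tag literal
	}
	if bareSemver.MatchString(pin) {
		return skill + "-" + pin // bare semver expands to <skill>-vX.Y.Z
	}
	return pin // assumption: anything else (e.g. a commit SHA) passes through
}

func main() {
	fmt.Println(resolvePinTag("terraform-plan-review", "v1.4.0"))
	fmt.Println(resolvePinTag("terraform-plan-review", "terraform-plan-review-v1.4.0"))
}
```

Keeping the expansion a pure string function means the two pin forms can be compared in a unit test before any fetch happens; whether the resolved tag actually exists is still decided by git.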
+ +## US4 — Trustworthy failures · P2 (unit-level via the stub seam + integration) + +**`TestSkillcore_ClassifyAuthError`** (unit) — stderr `Authentication failed` → `AuthError`. +**`TestSkillcore_ClassifyUnreachable`** (unit) — stderr `Could not resolve host` → `UnreachableError`. +**`TestSkillcore_ClassifyNotFound`** (unit) — stderr `repository '…' not found` → `NotFoundError`. +**`TestQuickstart_AddAuthFailureDistinct`** — injected auth failure → exit 1, message is an **authentication** failure distinct from not-found/unreachable, points at `gh auth login`/`GITHUB_TOKEN`. +**`TestQuickstart_AddPrivateNotFoundHintsAuth`** — not-found + no resolved token → message adds the "if private, authenticate" hint (D4 subtlety). +**`TestQuickstart_AddUnreachableDistinct`** — injected unreachable → exit 1, distinct message. +**`TestQuickstart_VerboseShowsRawCause`** — any of the above with `--verbose` prints the raw git/gh stderr (never swallowed). + +## US5 — Catalog generation (index) · P2 + +**`TestQuickstart_IndexGenerates`** — `skillrig index` over the origin fixture writes `index.json` whose entries match the skills' frontmatter, **including topics** (the field `build-index.sh` dropped). +**`TestQuickstart_IndexDeterministic`** — run twice on unchanged skills → byte-identical output (SC-009). +**`TestQuickstart_IndexMatchesCommitted`** — `skillrig index` output **equals** the committed PoC `index.json` (producer == artifact oracle). +**`TestQuickstart_IndexMalformedFrontmatter`** — a skill with broken frontmatter → exit 1 naming the offending `SKILL.md`. +**`TestQuickstart_IndexNotInOrigin`** (C8) — running `skillrig index` outside an origin repo (no `.skillrig-origin.toml` / unreadable `skills_dir`) → exit 1 with the what/why/fix "run inside the origin repo" navigation message. 
+**`TestQuickstart_IndexMissingVersion`** (C9) — a skill whose frontmatter omits the required `x-skillrig.version` → exit 1 naming the offending `SKILL.md` (the catalog-entry validation rule from data-model §1; guards the seed-enrichment precondition of `IndexMatchesCommitted`). + +## Regression (no 002 break · SC-007) +**`TestQuickstart_AddLocalPathStillWorks`** — the 002 local-path `add` suite passes unchanged against an explicit local-path origin. +**Manifest migration:** existing `verify`/`add` ground-truth + lock tests pass after `ParseManifest` is rewritten on `SKILL.md` frontmatter (the migrated fixtures replace `skill.toml`). + +--- + +### Traceability +| US | Scenarios | FRs | SCs | +|---|---|---|---| +| US1 search | SearchQueryMatchesNameDesc/ListsSkills/OrderingDeterministic/FilterByTopic/EmptyResult/JSONComplete/ConventionMismatch/ConventionBoundary/HelpExamples | 001–002a, 003, 004, 005, 016, 020, 021 | 002, 005, 008 | +| US2 add remote | AddRemoteNoLocalCopy/Idempotent/ForceOnDivergence/DryRun/HelpExamples | 006–010, 012, 020 | 001, 003, 006, 008 | +| US3 pin | AddPinnedReproducible/AddPinTagFormEquivalent/AddPinNotFound | 013–015 | 004 | +| US4 failures | Classify*/AddAuth/PrivateNotFound/Unreachable/Verbose | 016–019, 022 | 005 | +| US5 index | IndexGenerates/Deterministic/MatchesCommitted/Malformed/NotInOrigin/MissingVersion | 023, 025–028 | 009 | +| regression | AddLocalPathStillWorks + migration | 011 | 007 | diff --git a/specledger/003-search-remote/research.md b/specledger/003-search-remote/research.md new file mode 100644 index 0000000..2a2564e --- /dev/null +++ b/specledger/003-search-remote/research.md @@ -0,0 +1,86 @@ +# Phase 0 — Research: `003-search-remote` + +All design uncertainties were resolved by four time-boxed spikes during `/specledger.clarify` (2026-05-31). This file consolidates their conclusions in Decision/Rationale/Alternatives form. 
**No NEEDS CLARIFICATION remain.** Full writeups: `research/2026-05-31-{skill-manifest-format,catalog-generation-lifecycle,auth-token-resolution,remote-git-testing}.md`. + +## Prior work + +- **001 (epic SL-227789, closed):** `config.ResolveOrigin` (env > project > global), `OWNER/REPO[@REF]` grammar (`config.Origin{Owner,Repo,Ref}`), baseline CLI (help, errors-as-navigation, two-level output, exit codes). This slice reuses the resolver verbatim and the `@REF` branch pointer. +- **002 (merged):** `pkg/skillcore` (`ParseManifest`, `TreeSHA` via shell `git`, `Add`, `Verify`, lock, typed errors, the `git.go` `commandContext` exec-stub seam), local-copy `add` (byte-identical vendor, idempotent no-op, force-on-divergence, path-traversal + symlink guards), offline `verify`. 002 tested against a **local git working tree** (`git init`+fixtures+commit in tmpDir; no `file://`, no remote). This slice **extends** `add` with a remote fetch path and **reuses** the vendor/treeSha/lock machinery unchanged. +- 002's `add` overloaded `OWNER/REPO` as a directory `/OWNER/REPO` (the seam this slice splits into explicit-local-path vs remote-fetch). + +## D1 — Skill manifest format (Spike S1) + +**Decision:** Migrate each skill's machine metadata into **`SKILL.md` agentskills.io frontmatter**; drop the `skill.toml` sibling file. Standard fields (`name`, `description`) used verbatim; skillrig-specific data (`version`, `namespace`, `topics`, `convention-version`, `requires`) under the standard's free-form `metadata` map as **`metadata.x-skillrig.*`**. Parser: `gopkg.in/yaml.v3`. + +**Rationale:** The agentskills.io `metadata` map is the spec-sanctioned extension point (its own example puts `version` there). The Go `gh` CLI does exactly this in production (`internal/skills/frontmatter/frontmatter.go`, `yaml.v3`, flat dotted keys like `metadata.github-tree-sha`, prefixed to avoid collisions). 
One atomic file per skill, portability across 26+ compliant clients, no parallel format to lint, and it removes 002's latent `name`/`description` duplication-drift bug. Migration is small/in-slice (commit 1): only `pkg/skillcore/manifest.go` (~47 lines, single caller `add.go:91`); `verify.go`'s `isSkillDir` already accepts `SKILL.md`. + +**Correction to the original hypothesis:** `requires` does **NOT** go in `allowed-tools` — the standard defines `allowed-tools` as a space-separated string of agent-permission invocations (`Bash(git:*) Read`) and `gh` actively rejects an array form; `compatibility` is free-text prose. So `requires` (tool + version constraint + private `source`) lives under `metadata.x-skillrig.requires`. + +**Alternatives considered:** (a) keep `skill.toml` — rejected: diverges from the ecosystem standard, keeps the duplication bug, two formats to maintain. (b) put `requires` in `allowed-tools` — rejected: wrong semantics, `gh` rejects arrays. (c) a separate "manifest reframe" feature first — rejected: the parser is the only real change and it's ~47 lines; landing it as commit 1 of 003 means the fetch/catalog/verify code is written against the new format once. + +**Risk carried to implementation:** `metadata.x-skillrig.requires` is a nested list, bending the spec's string→string `metadata` letter. `gh`'s `map[string]interface{}` parses it fine and it's namespaced; **validate against `skills-ref validate` during build**, fall back to a JSON-encoded string only if a strict validator rejects it. + +## D2 — Catalog generation & lifecycle (Spike S2) + +**Decision:** Ship **`skillrig index`** (origin-side generator) **in 003**. The catalog is **single-tip** (reflects only the skills at the origin's selected branch/ref — one entry per skill = the HEAD version), **full-regenerated** from HEAD frontmatter on each run; **no cross-ref/version-history aggregation, no GC** (YAGNI). 
Version history lives in git tags, reached by `add --pin ` (D5), never via the catalog. + +**Rationale:** `search` is only as honest as the catalog; the shipped `build-index.sh` provably drifts (emits `name/version/description/path`, drops `tags`/`requires` — that is FR-023). skillrig is the single tool for origin maintenance, and the generator is thin: walk `skills/*/SKILL.md` + the **same** `ParseManifest` consumers use + marshal (AP-04 by construction). The origin's `index.yml` workflow is already authored to call it (`command -v skillrig … skillrig index --out`) and is **`push: main` (paths `skills/**`) triggered**, full-regenerating and committing if changed. Single-tip keeps the root `skillrigConvention` coherent and bounds catalog size; removed-at-HEAD skills correctly disappear from `search` while already-vendored consumers stay fine (their lock is offline-verifiable). + +**Alternatives considered:** (a) consume-only + roadmap a generator — rejected: ships `search` against a known-broken producer. (b) cross-ref aggregated catalog (all versions across tag history) — rejected: needs tag-walking, grows unbounded, breaks the single convention root; pins already cover history. (c) append-only + GC — rejected: nothing accumulates under full-regenerate, so GC is moot. + +**Contract test:** `skillrig index` over the origin fixture MUST equal the committed `index.json` (producer == artifact), mirroring the tree-SHA oracle. + +## D3 — Origin classification: local vs remote (firm decision, §8a) + +**Decision:** The origin is **either** a remote `OWNER/REPO` (fetched over the network) **or** an explicitly-configured **local filesystem path**. The tool **never** creates or caches a local copy of a remote — there is no "both present" precedence. It reports which form it used. + +**Rationale:** Confirmed against 002's code (it conflated the two by treating `OWNER/REPO` as a path). 
Matches the user's intent and keeps `search` correct under fetch-per-call (no stale cache to reconcile). FR-011 (local add) is preserved as the explicit-path form. + +**Alternatives considered:** tool-managed local cache (original assumption A1) — rejected by review: introduces staleness/precedence the tool can't honestly resolve. + +## D4 — Authentication / token resolution (Spike S3) + +**Decision:** Resolve a GitHub token via `os.exec`, order: **`GH_TOKEN` env → `GITHUB_TOKEN` env → `gh auth token --hostname github.com`** (exit 0 + non-empty stdout = token; non-zero = no session → skip, not fatal; `gh` absent → skip silently). Inject via **`git -c http.extraHeader="Authorization: Basic "`** — never embedded in the clone URL. Seam signature `ResolveGitHubToken(hostname string)`; **GitHub Enterprise deferred** (one-line extension later). `git credential fill` deferred. + +**Rationale:** Mirrors `gh`'s own precedence; `gh auth token` cleanly surfaces keyring-stored tokens that reading `hosts.yml` directly would miss. No `gh`-as-a-library (heavy); no bespoke credential store. `http.extraHeader` avoids token leakage via process listing / shell history. (S3 also **corrected** architecture §8b.2: mise's real precedence puts env vars before `credential_command` — doesn't affect skillrig, but the doc claim is wrong → FR-024.) + +**Failure classification** (all three exit `128`; split by `git`/`gh` **stderr**): `Authentication failed`/`Invalid username or token` → **AuthError** (FR-017); `repository '…' not found` → **NotFoundError** (FR-012); `Could not resolve host`/`Failed to connect` → **UnreachableError** (FR-018). 
**Private-repo subtlety:** GitHub returns *not found* (not 403) for a private repo with no/bad token, so NotFound + no resolved token MUST add the hint *"if this is a private origin, authenticate via `gh auth login` or set GITHUB_TOKEN."* + +**Alternatives considered:** vendor `gh`'s auth packages (too heavy); plain `GITHUB_TOKEN`-only (misses `gh`/keyring users). + +## D5 — Identity, fingerprint, pins (firm decisions, §8a) + +**Decision:** At add time, record in the lock entry: **`commit`** (provenance, exact upstream commit), **`treeSha`** (label-honesty, git tree-SHA computed from the fetched subtree by the *same* `skillcore` code `verify` recomputes), **and the resolved human-readable `version`/`tag`**. `--pin ` is a per-skill immutable tag/SHA (distinct from the origin-level `@ref` branch); `tag_scheme = "name-vSEMVER"` ⇒ `--pin v1.4.0` resolves to tag `terraform-plan-review-v1.4.0`. Non-existent pin → distinct "no such version" (NotFoundError variant). The origin publishes no per-skill tree-SHA, so label-honesty = "matches what was vendored," anchored by provenance — not an origin-attested hash (per-version tree-SHA publishing deferred). + +**Rationale:** A commit is opaque to humans; the tag conveys the version ordering they reason about (kept even if upstream rewrites it). `verify` then checks on-disk content against `treeSha` offline, unchanged from 002. + +## D6 — Remote-git test substrate (Spike S4) + +**Decision:** Three tiers — (1) **happy/integrity**: `file://` + a local **bare** repo in `t.TempDir()` (push the fixture working tree to a bare, point the CLI at `file://`), running the real `git clone --sparse` offline; ground-truth assertion `fetched treeSha == rawTreeSHA(fixture,"HEAD","skills/")`. (2) **FR-017/018 + transient**: extend the **existing** `pkg/skillcore/git.go` `commandContext` exec-stub seam to the new `Clone`/`FetchSparse` and inject `(exit=128, stderr=…)` — `pkg/skillcore` unit tests; classification lives in `skillcore`, not `cli`. 
(3) **Reject** real git-over-HTTP `httptest` (smart-HTTP/CGI handshake is fragile, OS-specific, unnecessary). + +**Rationale:** skillrig owns error-classification + rendering + exit codes; the exec-boundary seam exercises all of it deterministically and offline. (This is the justified divergence from Constitution §III's "httptest + go-vcr for the GitHub path" — skillrig shells `git`, it never calls the GitHub HTTP API, so there is no HTTP boundary to record.) + +**Not coverable offline (future E2E/manual, not gate-blockers):** GitHub's real auth handshake, mid-stream TCP abort, HTTP 429. + +**Alternatives considered:** `httptest`/go-vcr (no HTTP boundary exists to mock); `file://`-only (can't simulate auth/unreachable/transient). + +## D7 — Fetch transport (leaning → confirmed) + +**Decision:** Shell **`git` partial-clone + sparse-checkout** for both the skill subtree and the catalog file (a sparse single-file checkout of `index.json`), one transport. **Rationale:** keeps the "shell `git`, no in-process hashing dep" stance, makes the tree-SHA ground-truth trivial (git computes it), and keeps auth uniform (one `http.extraHeader` path). **Alternative:** raw HTTPS GET (`raw.githubusercontent`/contents API) for the catalog — rejected for this slice (a second transport + a second auth path for marginal latency benefit; revisit only if `git`-fetching a single file proves too slow). + +## D8 — Search algorithm, index storage, terminology (Spike S5) + +**Decision:** `search` is **query-first**. The positional `[QUERY...]` is a **case-insensitive token-AND substring** match over `name + description + topics` (a skill matches iff every whitespace-separated term is a substring of that concatenated text). `--topic` is a separate exact-string, case-insensitive, AND-across-repeats membership filter. 
**Ordering** is a pure deterministic function of (query, entry): a fixed relevance bucket — exact-name `3` > name-hit `2` > topic-hit `1` > description-only `0` — then **lexicographic by `name`** (unique per S2 = stable total order). Empty query + no topic ⇒ list all by name; no match ⇒ empty + exit 0. **No fuzzy/Levenshtein/embedding/TF-IDF/BM25** (N6). + +**Storage:** keep the single committed **flat `index.json`** (S2) + an **in-memory filter** at search time. A flat scan is microseconds for tens–hundreds of entries and is dwarfed by the per-call git fetch; an index structure earns its keep only at ~10k+ docs in a long-lived process — pure YAGNI here. Reject a committed inverted index (duplicate source-of-truth, doubles drift) and any binary index. + +**Dependency:** **stdlib only** (`strings.ToLower/Contains/Fields`, `slices.SortFunc`) — ~30 lines, deterministic, maximally testable. `bleve` rejected (persists a *binary* index → git-hostile + heavy); `closestmatch`/`lithammer/fuzzysearch` no win on short names. The pre-approved-dep escape hatch *if forgiving matching is ever wanted post-v0* is `github.com/sahilm/fuzzy` (pure-Go, no-cgo, deterministic) — **not** pulled in for v0. (So the user's "new deps acceptable" turned out unnecessary — a YAGNI win.) + +**Terminology:** **rename label → topic** (`--topic`, `topics[]`, `metadata.x-skillrig.topics`). The agentskills.io spec defines *no* tags/topics/keywords field, so nothing upstream breaks; GitHub's repo-"topics" reinforces the term; and it removes the collision with git-tag version pins (`--pin `, `name-vSEMVER`). Git-tag usages stay "tag." **Flag is `--topic`** (not a generic `--filter` — one dimension exists, YAGNI; pre-release means a `--filter field=value` can be added later if multiple dimensions emerge). + +**Placement / fields:** consumer-side in-memory filter over the per-call-fetched single-tip catalog; the matcher lives in `pkg/skillcore` (presentation-free, AP-04). 
**No new catalog field** — `name`+`description`+`topics` suffice; `description` is the ecosystem-sanctioned keyword field, so no separate `keywords`. + +**Alternatives considered:** fuzzy/typo tolerance, TF-IDF/BM25 ranking, a committed inverted index, a `keywords` field, a search cache/server — all rejected (N6 / YAGNI / git-friendliness). **Scope guard:** these stay out of v0. + +## Open items intentionally deferred (not blockers) +- GitHub Enterprise host auth (S3) · per-version tree-SHA publishing (S5/§8a) · catalog caching (D-catalog-fetch) · cross-ref aggregation (D2) · `httptest` real-HTTP coverage (D6). All recorded; none block this slice. diff --git a/specledger/003-search-remote/research/2026-05-31-auth-token-resolution.md b/specledger/003-search-remote/research/2026-05-31-auth-token-resolution.md new file mode 100644 index 0000000..6be6db2 --- /dev/null +++ b/specledger/003-search-remote/research/2026-05-31-auth-token-resolution.md @@ -0,0 +1,255 @@ +# Research: Auth / Token Resolution for Private GitHub Origins + +**Date**: 2026-05-31 +**Context**: Spike S3 from `spec-tech.md §8b` — how does `skillrig` obtain a GitHub token to fetch a PRIVATE origin? Direction was already decided: `os.exec` of `gh`/`git`, NOT vendoring `gh` auth as a library. This spike validates the exact mechanism, precedence order, and how to detect and distinguish the three failure classes (FR-017: auth, FR-018: unreachable, and "not found"). +**Time-box**: ~45 minutes + +--- + +## Question + +What is the correct token-resolution order for skillrig to use when fetching a private GitHub origin, and how should it detect/distinguish auth failure (FR-017) from "not found" (404) and "unreachable" (FR-018) when shelling git/gh? 
+ +--- + +## Findings + +### Finding 1: gh-cli token-resolution internal chain (confidence: HIGH) + +Source: `/Users/vincentdesmet/specledger/skillrig/gh-cli` — specifically: +- `internal/config/config.go` — `AuthConfig.ActiveToken(hostname)` +- `internal/go-gh/v2/pkg/auth/auth.go` (vendored via go.mod `github.com/cli/go-gh/v2 v2.13.0`) — `TokenForHost` / `TokenFromEnvOrConfig` + +The `gh` CLI resolves a token via this internal chain (for `github.com`): + +1. **`GH_TOKEN` env var** — checked first, always wins for github.com +2. **`GITHUB_TOKEN` env var** — second for github.com +3. **`hosts.yml` `oauth_token`** (plain-text config at `~/.config/gh/hosts.yml` or `$GH_CONFIG_DIR/hosts.yml`) +4. **System keyring** — `gh auth token --secure-storage` shells to the OS keyring (macOS Keychain, Linux secret service, Windows Credential Manager) + +The function `go-gh/v2/pkg/auth.TokenFromEnvOrConfig` handles steps 1–3; step 4 is the keyring, accessed via `gh auth token --secure-storage --hostname ` (that's what `go-gh`'s `TokenForHost` does when the env/config lookup returns empty — it shells to `gh auth token --secure-storage`). + +The `gh auth token` CLI command (`pkg/cmd/auth/token/token.go`) calls `authCfg.ActiveToken(hostname)`, which delegates to `ghauth.TokenFromEnvOrConfig` then falls back to the keyring. It prints the token to **stdout**, exits **0** on success, exits **1** and prints `"no oauth token found for "` to **stderr** on failure. + +**Verified behavior** (live test): +``` +# With valid session (keyring-stored): +gh auth token → prints token to stdout, exit 0 + +# With empty config dir, no env vars, no keyring: +GH_CONFIG_DIR=/tmp/empty HOME=/tmp gh auth token +→ "no oauth token found for github.com" on stderr, exit 1 +``` + +**Key point**: `GH_TOKEN` takes priority over `GITHUB_TOKEN` inside `gh`. For `skillrig`'s own resolution before calling `gh`, it must check env vars in the same order to avoid surprising double-exec. 
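The same-order check described in the key point above can be sketched as a small resolver: `GH_TOKEN`, then `GITHUB_TOKEN`, then a single `gh auth token --hostname` exec. This is an illustration rather than the planned `ResolveGitHubToken(hostname string)` seam: `resolveToken`, `ghAuthToken`, `errNoSession`, and the injected `getenv`/`ghToken` parameters are hypothetical, added so the order can be tested without a real environment or `gh` binary.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// errNoSession is a hypothetical sentinel for "gh is absent or has no token".
var errNoSession = errors.New("no gh session")

// resolveToken checks the env vars in gh's own order, then falls back to gh.
// A failing gh lookup is treated as "no token", not as a fatal error.
func resolveToken(hostname string, getenv func(string) string, ghToken func(host string) (string, error)) string {
	for _, key := range []string{"GH_TOKEN", "GITHUB_TOKEN"} {
		if v := strings.TrimSpace(getenv(key)); v != "" {
			return v
		}
	}
	out, err := ghToken(hostname)
	if err != nil {
		return "" // gh missing or exit non-zero: skip silently
	}
	return strings.TrimSpace(out)
}

// ghAuthToken shells out to `gh auth token --hostname <host>` and returns stdout.
func ghAuthToken(host string) (string, error) {
	out, err := exec.Command("gh", "auth", "token", "--hostname", host).Output()
	if err != nil {
		return "", fmt.Errorf("%w: %v", errNoSession, err)
	}
	return string(out), nil
}

func main() {
	tok := resolveToken("github.com", os.Getenv, ghAuthToken)
	// Never print the token itself; only report whether one was found.
	fmt.Println("token resolved:", tok != "")
}
```

Checking the env vars first means `gh` is only spawned when neither variable is set, which avoids a redundant exec for CI users who export `GITHUB_TOKEN`.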
+ +--- + +### Finding 2: mise token-resolution chain (confidence: HIGH) + +Source: `/tmp/mise-spike/src/github.rs` (`resolve_token` function) and `/tmp/mise-spike/src/tokens.rs`. + +mise's full precedence for `github.com` (from doc comments in `resolve_token`): + +``` +1. MISE_GITHUB_ENTERPRISE_TOKEN (non-github.com only — skipped for github.com) +2. MISE_GITHUB_TOKEN (mise-specific env var) + GITHUB_API_TOKEN (GitHub Actions alt) + GITHUB_TOKEN (standard GitHub Actions) +3. credential_command (mise config: settings.github.credential_command — user-defined shell cmd) +4. GitHub OAuth device-flow (mise's own native OAuth cache — mise-specific) +5. github_tokens.toml ($MISE_CONFIG_DIR/github_tokens.toml — per-host TOML file) +6. gh CLI hosts.yml (reads $GH_CONFIG_DIR/hosts.yml or ~/.config/gh/hosts.yml directly as YAML) +7. git credential fill (shells `git credential fill` with protocol=https + host) +``` + +**Architecture claim vs. actual source** (spec-tech.md §8b.2 claims): +``` +credential_command > MISE_GITHUB_TOKEN > github_tokens.toml > gh hosts.yml > git credential +``` +The **actual** order from source is: +``` +MISE_GITHUB_TOKEN/GITHUB_API_TOKEN/GITHUB_TOKEN > credential_command > OAuth > github_tokens.toml > gh hosts.yml > git credential +``` +The spec-tech.md claim had `credential_command` before env vars — **this is incorrect**. Env vars win in mise too. + +**Important nuance for skillrig**: mise reads `gh`'s `hosts.yml` directly as a YAML file (parsing `oauth_token` field per host) rather than shelling to `gh auth token`. This is a deliberate choice in mise to avoid spawning a subprocess — but it **misses tokens stored only in the keyring** (since `hosts.yml` only holds plaintext `oauth_token` when `--insecure-storage` was used). If the user logged in with `gh auth login` using secure storage (the default), the `oauth_token` field in `hosts.yml` is absent, and mise's step 6 finds nothing. 
+ +**Implication for skillrig**: shelling to `gh auth token` (no `--secure-storage`) is the correct approach — it handles both plaintext-config and keyring tokens via one exec, making it strictly more capable than mise's direct-file approach. + +The `git credential fill` subprocess in mise (`tokens.rs:get_git_credential_token`): +``` +input to stdin: "protocol=https\nhost=github.com\n\n" +parses stdout for: "password=" line +exit code: non-zero → no token (silent) +``` + +--- + +### Finding 3: `gh auth token` as the clean os.exec mechanism (confidence: HIGH) + +`gh auth token` is the right primitive for skillrig: + +- **Signature**: `gh auth token [--hostname ]` (default host = `github.com`) +- **stdout**: raw token string + newline (nothing else) +- **stderr**: error message only on failure +- **exit 0**: token available (token on stdout) +- **exit 1**: no token found (`"no oauth token found for "` on stderr) +- **exit non-zero + empty stdout**: treat as "no gh session, skip" + +The go-gh library (`TokenForHost`) uses `gh auth token --secure-storage --hostname ` internally to reach the keyring. skillrig does NOT need `--secure-storage` because `gh auth token` (without the flag) calls `ActiveToken` which already tries env, plaintext config, then keyring — it's the full chain. + +**What `gh auth token` does NOT do**: it does NOT make a network call to validate the token. It just returns whatever token gh has stored. The token may be expired or revoked; skillrig will discover this when the actual git/gh API call returns 401/403. + +--- + +### Finding 4: Distinguishing the three error classes from git/gh stderr (confidence: HIGH) + +All three failure classes result in `git ls-remote` (and `git clone`) exiting with **128** and writing to **stderr**. The exit code alone does not distinguish them. 
The stderr message pattern does: + +| Class | stderr contains | FR | +|---|---|---| +| **Auth failure** | `"fatal: Authentication failed for 'URL'"` | FR-017 | +| **Repo not found** | `"fatal: repository 'URL' not found"` | (not-found) | +| **Network unreachable** | `"fatal: unable to access 'URL': Could not resolve host: HOST"` | FR-018 | +| **Network timeout/refused** | `"fatal: unable to access 'URL': Failed to connect to HOST"` | FR-018 | + +**Verified live**: +```bash +# Auth failure (bad token): +git ls-remote https://x-access-token:badtoken@github.com/skillrig/origin-template +→ "remote: Invalid username or token." + "fatal: Authentication failed for 'https://github.com/skillrig/origin-template/'" + exit: 128 + +# Not found (nonexistent public repo): +git ls-remote https://github.com/skillrig/this-repo-does-not-exist-xyzabc +→ "remote: Repository not found." + "fatal: repository 'https://github.com/skillrig/this-repo-does-not-exist-xyzabc/' not found" + exit: 128 + +# Unreachable (DNS failure): +git ls-remote https://github.nonexistentdomain.invalid/owner/repo +→ "fatal: unable to access 'https://github.nonexistentdomain.invalid/owner/repo/': Could not resolve host: github.nonexistentdomain.invalid" + exit: 128 +``` + +**Note for private repos**: a private repo with no/invalid token returns a "not found" message, not "authentication failed". GitHub deliberately obscures private repos by returning 404/not-found when unauthenticated rather than 403/forbidden. 
This means: + +- "Repository not found" + no token → likely auth/visibility problem (warn user) +- "Repository not found" + valid token → origin typo or repo deleted +- "Authentication failed" → token exists but is wrong/revoked + +skillrig should special-case the "not found" + no-token path with a hint: `"origin not found — if this is a private repo, ensure you are authenticated (run 'gh auth login' or set GITHUB_TOKEN)"` + +--- + +### Finding 5: `git credential fill` as last resort (confidence: MEDIUM) + +mise uses `git credential fill` as its step 7 fallback. For skillrig this is a reasonable last resort but adds complexity. The call: + +```bash +printf 'protocol=https\nhost=github.com\n\n' | git credential fill +# Parses "password=" from stdout +``` + +This reaches `git`'s configured credential helper (macOS Keychain via `osxkeychain`, Windows Credential Manager via `manager`, `gh`'s credential helper if configured as `git config --global credential.helper 'gh auth git-credential'`). Since `gh auth login` sets `git config --global credential.helper gh` on newer versions, this path may already be covered by the `gh auth token` exec. Avoid adding `git credential fill` as a separate step — it would only add value for users who have git credentials configured but have not run `gh auth login`. That is a very narrow case; defer to a later iteration. + +--- + +### Finding 6: GitHub Enterprise (GHES/GHE.com) — DEFERRED + +Per the spike instructions. Note for backlog: +- `GH_ENTERPRISE_TOKEN` / `GITHUB_ENTERPRISE_TOKEN` env vars (from go-gh source) +- `gh auth token --hostname ` supports non-github.com hosts +- The origin `OWNER/REPO` parser already has room for a custom host prefix +- Design note: keep the token-resolution code path host-parameterized (pass hostname, default `"github.com"`) so GHE can be wired in without restructuring + +--- + +## Decisions + +- **D-auth-1: Token resolution order for skillrig (3 steps, in precedence order):** + 1. 
**`GH_TOKEN` / `GITHUB_TOKEN` env vars** (checked first; `GITHUB_TOKEN` matches the CI/CD standard, and `GH_TOKEN` wins when both are set, per gh-cli behavior) + 2. **`gh auth token` exec** — shells to `gh auth token --hostname github.com`; stdout is the token if exit 0, skip if exit non-zero + 3. **No token** → surface a typed `ErrNoAuth` immediately with an actionable message ("set GITHUB_TOKEN or run 'gh auth login'") rather than attempting the fetch unauthenticated and getting a confusing "not found" + + The `git credential fill` path (mise's step 7) is **deferred** — it only adds value for a narrow case already covered by step 2 in most developer environments. + +- **D-auth-2: When to resolve the token**: resolve at fetch time, not at `config.ResolveOrigin`. The origin may be public; don't require auth just because the config is set. One option is to attempt unauthenticated first and, if the origin is private and returns "not found", retry with the token (or prompt the user). Simpler: resolve the token eagerly at fetch entry and pass it to the transport explicitly, for example via `-c http.extraheader` (D-auth-3). `git` does not read `GITHUB_TOKEN` from the environment on its own; the variable only takes effect when a credential helper such as `gh auth git-credential` is configured. + +- **D-auth-3: Inject token into git via environment** — pass `GITHUB_TOKEN=<token>` into the `git clone`/`git ls-remote` subprocess environment. Git's credential system reads it only if `credential.helper` is configured with `gh auth git-credential`, so more reliably: construct the clone URL as `https://x-access-token:<token>@github.com/OWNER/REPO` OR set the `Authorization` header via `-c http.extraheader="Authorization: Basic $(printf 'x:%s' "$TOKEN" | base64)"`. Embedding the token in the URL is simpler, but the URL appears in the process list and `git clone` stores it as the remote URL in `.git/config`; prefer `-c http.extraheader` or the `git credential approve` + `git clone` pattern. + + **Actually simplest**: `git clone --config "http.https://github.com.extraheader=Authorization: Basic $(printf 'x-access-token:%s' "$token" | base64)"` avoids embedding in URL.
+ + Even simpler for skillrig's fetch: if `gh` is available and the token came from `gh auth token`, **shell the entire fetch via `gh`**: `gh api /repos/OWNER/REPO/contents/path` or `gh api /repos/OWNER/REPO/tarball/REF` — `gh` handles auth injection automatically. Reserve raw `git` transport for when `gh` is unavailable. + +- **D-auth-4: Error classification** — Parse `stderr` from `git ls-remote` / `git clone` to distinguish: + - Contains `"Authentication failed"` or `"Invalid username or token"` → `ErrAuth` (FR-017) + - Contains `"not found"` → `ErrNotFound`; if no token was resolved, annotate: likely private origin needing auth + - Contains `"unable to access"` + `"Could not resolve host"` or `"Failed to connect"` → `ErrUnreachable` (FR-018) + - All others → wrap as `ErrUnknown` with raw stderr in `--verbose` + +- **D-auth-5: `gh auth token` output contract** — `exit 0` + non-empty stdout = token. `exit non-zero` = no session (skip, not fatal at resolution time). `gh` not found in PATH = no `gh` session available; fall back to env-only. Never treat a missing `gh` binary as an error. + +- **D-auth-6: GitHub Enterprise — deferred** to a follow-up; design the token-resolution function to accept a `hostname string` parameter (default `"github.com"`) so GHE is a one-line extension. + +--- + +## Recommendations + +1. 
**Implement `pkg/skillcore.ResolveGitHubToken(hostname string) (token string, source string, err error)`** with this body: + ```go + // Needs imports: os, os/exec, strings. + // Step 1: env vars (GH_TOKEN wins over GITHUB_TOKEN, matching gh's own precedence) + if t := os.Getenv("GH_TOKEN"); t != "" { + return t, "GH_TOKEN", nil + } + if t := os.Getenv("GITHUB_TOKEN"); t != "" { + return t, "GITHUB_TOKEN", nil + } + // Step 2: gh auth token exec (a missing gh binary or a non-zero exit just means "no gh session") + out, err := exec.Command("gh", "auth", "token", "--hostname", hostname).Output() + if tok := strings.TrimSpace(string(out)); err == nil && tok != "" { + return tok, "gh", nil + } + // Step 3: no token found + return "", "", nil // not an error; caller decides if token is required + ``` + Return `("", "", nil)` — callers that need a token check for empty string and surface `ErrNoAuth`. + +2. **Implement `pkg/skillcore.ClassifyGitError(stderr string, exitCode int) error`** using the patterns from Finding 4. Use the typed errors `ErrAuth`, `ErrNotFound`, `ErrUnreachable` so `internal/cli` can render distinct messages per FR-016–018. + +3. **When "not found" + no token**: render as: `"origin not found: github.com/OWNER/REPO — this may be a private origin; set GITHUB_TOKEN or run 'gh auth login'"` — not just "not found". + +4. **Auth failure wording** (FR-017): `"authentication failed for github.com/OWNER/REPO — your token may be invalid or expired; run 'gh auth login' or check GITHUB_TOKEN"`. Always include the fix. + +5. **Inject token into git subprocess** via the `-c http.extraheader` flag rather than URL-embedding (this keeps the token out of the remote URL that `git clone` stores in `.git/config`; note that a `-c` value is still part of the subprocess argument list, so it remains visible in the process table). Pattern: + ```go + headerVal := "Authorization: Basic " + base64.StdEncoding.EncodeToString([]byte("x-access-token:"+token)) + cmd := exec.Command("git", "-c", "http.extraheader="+headerVal, "ls-remote", repoURL) + ``` + +6. **Do not read `~/.config/gh/hosts.yml` directly** (as mise does).
Shelling to `gh auth token` is the correct interface — it handles keyring-stored tokens that plaintext YAML misses, and it is the stable public API surface for the `gh` binary. + +7. **GitHub Enterprise**: defer. Note in `docs/ARCHITECTURE-v0.md` as backlog: `gh auth token --hostname ` and `GITHUB_ENTERPRISE_TOKEN` env var. + +--- + +## References + +- `gh-cli` token resolution (checked out at `/Users/vincentdesmet/specledger/skillrig/gh-cli`): + - `internal/config/config.go` — `AuthConfig.ActiveToken` (env → config → keyring chain) + - `internal/config/auth_config_test.go` — confirms `GH_TOKEN` > `GITHUB_TOKEN` precedence (see `TestTokenStoredInEnv`) + - `pkg/cmd/auth/token/token.go` — `gh auth token` command: stdout=token, exit 1 + stderr on failure + - Go module `github.com/cli/go-gh/v2@v2.13.0` at `/Users/vincentdesmet/go/pkg/mod/github.com/cli/go-gh/v2@v2.13.0/pkg/auth/auth.go` — `TokenFromEnvOrConfig` + `tokenFromGh` (the `--secure-storage` exec path) + +- `mise` token resolution (cloned at `/tmp/mise-spike`): + - `src/github.rs` — `resolve_token` function (lines 479–565): full 7-step precedence with doc comments + - `src/tokens.rs` — `get_credential_command_token` and `get_git_credential_token` subprocess implementations + - `src/cli/token/github.rs` — `mise token github` command (debug/diagnostics) + +- Live test observations (2026-05-31): + - `gh auth token` with empty config + no env → exit 1, `"no oauth token found for github.com"` to stderr + - `git ls-remote` with bad token → exit 128, `"fatal: Authentication failed for 'URL'"` to stderr + - `git ls-remote` on nonexistent repo → exit 128, `"fatal: repository 'URL' not found"` to stderr + - `git ls-remote` with bad hostname → exit 128, `"fatal: unable to access '...': Could not resolve host: ..."` to stderr diff --git a/specledger/003-search-remote/research/2026-05-31-catalog-generation-lifecycle.md b/specledger/003-search-remote/research/2026-05-31-catalog-generation-lifecycle.md new file mode 100644 
index 0000000..8185db5 --- /dev/null +++ b/specledger/003-search-remote/research/2026-05-31-catalog-generation-lifecycle.md @@ -0,0 +1,172 @@ +# Research: Catalog Generation & Lifecycle — who builds `index.json`, single-tip vs cross-ref, and is `skillrig index` in 003? + +**Date**: 2026-05-31 +**Context**: Spike S2 for `003-search-remote` (spec-tech.md §8b). `search` reads a catalog (`index.json`) at the origin; this spike fixes the catalog's *data model* (single-tip vs cross-ref aggregated, which versions/fields), its *regeneration/GC policy*, and the *scope decision* for the origin-side generator `skillrig index`. Upstream S1 (`2026-05-31-skill-manifest-format.md`) is DECIDED: skill metadata moves to agentskills.io frontmatter in `SKILL.md`, skillrig fields under `metadata.x-skillrig.*`; `skill.toml` is dropped. So the catalog's field-source is now frontmatter, not `skill.toml`. +**Time-box**: ~30 min +**Confidence**: HIGH on the lifecycle/aggregation model (read directly off the real origin's workflows + release config); HIGH on the scope call (grounded in architecture §2/§9 single-impl rule + the existing fallback-script seam). + +## Question + +Who generates and maintains the catalog `search` reads, how, and is `skillrig index` (the origin-side generator) in 003's scope or a sibling feature? Specifically: (1) is the catalog **single-tip** (HEAD tree) or **cross-ref aggregated** (skills/versions across older tags/releases)? (2) is it **full-regenerate**, **append-only**, or does it need **GC**? (3) does building `skillrig index` belong in 003 (the catalog must actually be generatable for `search` to mean anything, and skillrig is the origin tool too), or is 003 consume-only + a contract test (FR-023) with `skillrig index` a sibling feature? 
+ +## Findings + +### Finding 1: Today the catalog is single-tip, full-regenerate, merge-triggered — NOT release/tag-triggered + +The real origin's index workflow (`/Users/vincentdesmet/specledger/skillrig/skillrig-origin/.github/workflows/index.yml`) is unambiguous on every lifecycle axis: + +```yaml +on: + push: + branches: [main] + paths: ["skills/**", ".skillrig-origin.toml", "policy.toml"] +... +- name: Regenerate index.json + run: | + if command -v skillrig >/dev/null 2>&1; then + skillrig index --out index.json + else + ./scripts/build-index.sh > index.json + fi +- name: Commit if changed + run: | + if ! git diff --quiet -- index.json; then ... git commit -m "chore: regenerate index.json"; git push; fi +``` + +Four load-bearing facts: +1. **Trigger = push to `main`** (paths `skills/**`), **not** tag creation, **not** `release:` events. So the catalog reflects the **HEAD tree of the default branch** at each merge. +2. **Full-regenerate** — the step rebuilds `index.json` from scratch and commits *iff the file changed* (`git diff --quiet`). There is no merge/append of a previous catalog; the prior `index.json` is overwritten wholesale. A skill deleted from the HEAD tree simply vanishes from the next build (no tombstone, no GC needed — regeneration *is* the GC). +3. **The catalog walks the working tree, not git history.** `build-index.sh` (line 18) loops `for toml in skills/*/skill.toml` — it reads the checked-out directory, never `git log`/`git tag`. (Post-S1 this loop re-points at `skills/*/SKILL.md` frontmatter; the *traversal model* — "walk the current tree" — is unchanged.) +4. 
**`generated` banner in the committed `index.json`** says it verbatim: *"Produced by .github/workflows/index.yml running scripts/build-index.sh on merge to the default branch."* + +Architecture corroborates: §2 line 88 — *"the only write the system makes to the monorepo is `index.json` regeneration, and that's a **merge-triggered** GitHub Action running `skillrig index`."* §9 line 307 says *"On release, `internal/index` walks `skills/*/skill.toml`"* — note the *wording* "on release" is slightly stale vs. the actual merge trigger, but the **traversal is still HEAD-tree**, never tag-history. The merge trigger is the operative reality. + +**Conclusion:** the v0 catalog is, by construction, **single-tip / full-regenerate / no-GC**. The infrastructure to do cross-ref aggregation does not exist and was never built. + +### Finding 2: The version in the catalog comes from the manifest, decoupled from the git tag — one version per skill + +The committed `index.json` carries `"version": "1.4.0"` for `terraform-plan-review`. Where does that number come from, and does it track tags? + +- **Source = the manifest field**, not the git tag. `build-index.sh:21` greps `^version` out of the skill manifest (post-S1: `metadata.x-skillrig.version` in frontmatter). The skill's own `skill.toml`/frontmatter declares `version = "1.4.0"`. +- **release-please keeps that field in sync with the tag** but they are *separate stores*. `.release-please-manifest.json` records `"skills/terraform-plan-review": "1.4.0"`; `release-please-config.json` uses `include-component-in-tag: true` + `tag-separator: "-"` → merging a release PR cuts the prefixed tag `terraform-plan-review-v1.4.0` (`.skillrig-origin.toml` `tag_scheme = "name-vSEMVER"`) **and** bumps the version inside the skill's manifest in the same PR. So at HEAD, the manifest version == the latest tag's version, by release-please's own bookkeeping. 
+- **Therefore the catalog carries exactly ONE version per skill: the HEAD version**, which equals the newest released tag. It does **not** enumerate `1.3.0`, `1.2.0`, … — those live only as git tags (`terraform-plan-review-v1.3.0`), never in `index.json`. + +This is the crux for the aggregation question: **versions are not aggregated; the catalog shows the current released version of each skill that exists at HEAD.** + +### Finding 3: Cross-ref tag aggregation — the right v0 model is single-tip, and `--pin` does NOT need the catalog (comment 8e05b856) + +The load-bearing question: should the catalog aggregate skills/versions across older refs/releases, so `search` can show multiple versions of a skill, or a skill removed at HEAD but present at an older tag? + +**Decision: NO cross-ref aggregation for v0. The catalog reflects the current tip only.** Rationale, grounded: + +1. **`search` is discovery, `add --pin` is acquisition — and they use different planes.** spec-tech.md §2 line 28 already nails the catalog as *"discovery-only"* with *"no per-skill treeSha or commit."* §5 / D-pin: a `--pin v1.4.0` resolves to the git tag `terraform-plan-review-v1.4.0` and `add` fetches **that tag's subtree directly** (S4's `git clone --sparse` at the tag), recording commit + computed treeSha. **The pin path never consults the catalog for the old version** — git tags *are* the version-history index. So "search shows v1.4.0, user pins v1.3.0" works fine: search surfaces the skill's existence + current version; the user pins any tag they know; `add` fetches the tag. The catalog does not need older versions for `--pin` to function. + +2. **Cross-ref aggregation would require walking tag history at index time** — `git tag --list 'terraform-plan-review-v*'`, checking out or `git show`-ing each tag's manifest, and merging. 
That is a categorically different (and much heavier) generator than "walk the HEAD tree," would make `index.json` grow unbounded with release history (the GC problem from comment ef449651 becomes real), and would couple the catalog's correctness to tag-naming discipline across all of history. It buys only a *browse-old-versions* UX that git tags + `gh release list` already provide. + +3. **Removed-at-HEAD skills SHOULD disappear from search.** If an org deletes a skill from the default branch, `search` *not* listing it is correct — it signals "deprecated/withdrawn, don't adopt." A consumer who already vendored it keeps working (their lock has commit+treeSha; `verify` is offline and does not need the catalog). Re-acquiring a withdrawn skill by exact `--pin ` still works against the tag. So single-tip loses nothing a consumer needs, and gains "the catalog reflects what the org currently endorses." + +4. **`skillrigConvention` is a single scalar at catalog root**, not per-version. A cross-ref catalog spanning refs with *different* convention versions would have no coherent root convention — another sign aggregation fights the design. + +**Net:** the catalog answers *"what skills does this origin offer right now, and at what current version,"* keyed by `OWNER/REPO@ref` = a branch tip (architecture §9b identity grammar). Version *history* lives in git tags and is reached via `--pin`, not the catalog. This is the correct v0 model and it is also the *only* model the existing infrastructure implements. + +### Finding 4: Append-only vs full-regenerate vs GC (comment ef449651) — full-regenerate, GC is YAGNI + +Given single-tip (Finding 3), this question largely dissolves: + +- **Full-regenerate is correct and is what exists.** Each merge to `main` rebuilds `index.json` from the HEAD tree and commits if changed (Finding 1). The catalog is a pure function of the HEAD tree: `index.json = f(skills/*/SKILL.md frontmatter at HEAD)`. 
Deterministic, reproducible, no accumulated state. +- **Append-only is wrong here.** Appending (accumulating every version ever released into the catalog) is the cross-ref model rejected in Finding 3 — it reintroduces unbounded growth and the GC problem. As release-please cuts `terraform-plan-review-v1.5.0`, the *right* behavior is: the release PR bumps the manifest version to `1.5.0`, the merge re-triggers the index workflow, and the catalog's single row for that skill flips `version: 1.4.0 → 1.5.0`. The old row is *replaced*, not appended. +- **GC is YAGNI for v0.** With full-regenerate from HEAD, there is nothing to garbage-collect: stale entries can't accumulate because the file is rebuilt wholesale each time. GC only becomes a concept under an append-only/aggregated catalog — which we're not building. Record GC as explicitly out-of-scope, revisitable only if a future feature wants a version-history catalog. + +One sequencing nuance worth noting in the contract: the **release PR merge** (which bumps the manifest version) and the **index regeneration** are both `push: main` events, and the index workflow's `paths: skills/**` filter *does* fire on a release-please version bump to a skill's manifest (the bump edits a file under `skills/`). So the catalog stays consistent with released versions automatically — no separate "on release" hook needed. (This also means the §9 "on release" wording should be corrected to "on merge to main" — FR-024 doc reconciliation.) + +### Finding 5: Scope — build `skillrig index` IN 003, do not roadmap it as a sibling + +This is the decisive call. Weighing it against the architecture's single-implementation rule and the existing seam: + +**Arguments that `skillrig index` belongs in 003:** + +1. **The catalog MUST be generatable for `search` to be meaningful, and skillrig is the origin tool.** The spike's own framing flags "consume-only + roadmap a generator" as a *false economy*. 
`search` reads `index.json`; if `index.json` can only be produced by a documentation-grade bash script (`build-index.sh`) that **provably drifts** (it emits only `name/version/description/path`, dropping `tags`, `namespace`, `requires` — spec-tech.md §2 line 30; confirmed at `build-index.sh:25`), then `search --tag` has no trustworthy data source. FR-023 *requires* reconciling the generator with what `search` consumes. The cheapest correct way to reconcile is to make the **authoritative generator real**, not to patch the bash fallback to also emit tags (which just moves the drift risk). + +2. **AP-04 single-implementation makes it nearly free.** Architecture §2 (line 88) and §9, and `index.yml`'s own comments, state `skillrig index` *"shares skillcore/manifest parsing with verify/bump so values can't diverge."* Post-S1, 003 is **already** rewriting `pkg/skillcore`'s manifest parser to read `SKILL.md` frontmatter (S1 commit 1: replace `ParseManifest`). `skillrig index` is then: *walk `skills/*/SKILL.md`, call the same `ParseManifest`, marshal to the catalog JSON shape.* The parser — the hard, shared part — is being built in 003 regardless. The generator is a thin walk+marshal on top, in the same package, satisfying the single-impl rule *by construction*. Splitting it into a sibling feature would mean either (a) the sibling re-implements/duplicates frontmatter parsing (AP-04 violation) or (b) the sibling can't start until 003's parser exists anyway — so the coupling argues for *together*, not *apart*. + +3. **The seam already expects the binary.** `index.yml` (lines 36-40) already branches `if command -v skillrig … skillrig index --out index.json … else ./scripts/build-index.sh`. The CI contract is *written for* the binary; the bash script is explicitly the *"legible fallback for environments without the binary"* (`build-index.sh:1-6`). 
Shipping `skillrig index` lights up the path the origin template was authored against and lets FR-023 retire/demote the drifting fallback. + +4. **It closes the FR-023 drift at the source.** With `skillrig index` authoritative, the committed `index.json` and the generator can't disagree — they're the same code. FR-023 becomes "ship `skillrig index` + a contract test asserting `index.json == skillrig index` over the origin fixture," which is *stronger* than "reconcile a bash script's output by hand." + +**The one argument for deferral** — "003 is consumer-side (`search`+`add`), `index` is origin-side, keep slices small" — is outweighed because: the consumer (`search`) is *useless without a correct catalog*, and the generator shares 003's brand-new parser. Deferring `index` means 003 ships a `search` that reads a catalog only a drifting bash script can produce — i.e. ships the consumer half of a contract whose producer half is known-broken. That is the false economy the framing warned about. + +**Scope decision: `skillrig index` lands in 003** as a thin `pkg/skillcore` generator (`GenerateCatalog(skillsDir) → Catalog`) + a `skillrig index --out` command, sharing the S1 frontmatter parser with `add`/`verify`. FR-023's origin-template work becomes: (a) re-point `build-index.sh` at frontmatter as the *fallback* (keep it legible but demoted), (b) regenerate the committed `index.json` via `skillrig index`, (c) add a contract test. This keeps 003's slice coherent (the producer and consumer of the catalog ship together, against the same parser) rather than artificially small. + +*Bound the scope:* `skillrig index` ships **only** the single-tip/full-regenerate generator (Findings 1-4). No tag-history walking, no GC, no append. That keeps it small — it's a tree-walk + the parser 003 already has. 
+ +### Finding 6: Reconciliation with S1 — generator reads `metadata.x-skillrig.*` via the shared parser (AP-04) + +Per S1's D-S1-catalog-source, `skillrig index` produces each catalog row by: +- `name`, `description` ← standard frontmatter top-level fields. +- `version`, `namespace`, `tags`, `requires` ← `metadata.x-skillrig.*` (S1 Option A: `tags` space-string → split to `[]`; `requires` nested list-of-maps). +- `path` ← the skill's directory (`skills/`). +- Catalog root: `skillrigConvention` ← `.skillrig-origin.toml` `convention_version`; `origin` ← its `origin` field. + +Critically, `skillrig index` calls the **same** `ParseManifest(skillDir)` that `add` (vendoring) and `verify` (prereq read) call — one frontmatter parse implementation in `pkg/skillcore`, three callers. This is the §9/§2 "values can't diverge" guarantee made literal and is exactly the AP-04 discipline S1 set up. No second parser, no bash-grep of YAML in production (the fallback's grep is acknowledged-lossy and demoted). + +## Decisions + +- **D-S2-tip — the catalog is SINGLE-TIP, not cross-ref aggregated.** `index.json` at `OWNER/REPO@ref` reflects the HEAD tree of that branch: the skills that exist there, each at its current (HEAD == latest-released-tag) version. It does **not** enumerate older versions or skills removed at HEAD. Version *history* lives in git tags and is reached by `add --pin ` fetching the tag subtree directly — the catalog is never the version-history index (Findings 2, 3). This matches the only model the origin's existing `index.yml`/`build-index.sh` implement. +- **D-S2-regen — FULL-REGENERATE on merge to `main`; GC is YAGNI.** Each merge that touches `skills/**` rebuilds `index.json = f(HEAD frontmatter)` and commits iff changed. Not append-only (that's the rejected aggregated model). Nothing accumulates, so there is nothing to garbage-collect; record GC as out-of-scope, revisit only if a version-history catalog is ever wanted (Findings 1, 4). 
release-please version bumps land under `skills/**`, so they auto-retrigger the index and keep catalog versions == released tags with no extra hook. +- **D-S2-scope — `skillrig index` SHIPS IN 003** (not a sibling/roadmap item), as a thin `pkg/skillcore` generator + `skillrig index --out` command sharing S1's frontmatter parser with `add`/`verify` (AP-04). Rationale: `search` is useless without a non-drifting catalog; FR-023 demands reconciling the generator; the hard part (the frontmatter parser) is already being built in 003 by S1; and `index.yml` is already authored to call the binary. Deferring would ship a consumer against a known-drifting producer — the false economy the framing named. Scope-bounded to the single-tip/full-regenerate generator only (no tag-history, no GC). (Finding 5.) +- **D-S2-source — generator field-source per S1.** `name`/`description` ← standard frontmatter; `version`/`namespace`/`tags`/`requires` ← `metadata.x-skillrig.*`; `path` ← directory; root `skillrigConvention`/`origin` ← `.skillrig-origin.toml`. One shared `ParseManifest`, three callers (Finding 6). +- **D-S2-fallback — demote, don't delete, `build-index.sh`.** Re-point it at `SKILL.md` frontmatter so the contract stays legible for binary-less environments, but `skillrig index` is authoritative and the committed `index.json` is produced by it. The grep-based bash path is acknowledged-lossy (can't robustly parse nested `x-skillrig.requires`); it may legitimately emit a reduced schema as a *documented* fallback, with `skillrig index` as the full-fidelity producer. + +## Recommendations + +1. **Put `skillrig index` in 003's plan as a first-class command** (Query-adjacent / Environment-write pattern — it writes the origin's `index.json`; classify per cli.md and run the checklist). 
Implement `pkg/skillcore.GenerateCatalog(skillsDir, originCfg) (Catalog, error)` that walks `skills/*/SKILL.md`, calls the shared `ParseManifest`, and assembles the catalog struct; `internal/cli` adds `skillrig index [--out index.json] [--json] [--verbose]` rendering/writing it. +2. **Define the FR-023 catalog contract** (below) as the convention-1 catalog schema, and add a contract/ground-truth test: `skillrig index` over the origin fixture must equal the committed `index.json` (the producer==artifact guarantee), analogous to S4's `fetched treeSha == raw git tree-SHA` oracle-independence. +3. **Fix the origin template in-branch (FR-023):** re-point `build-index.sh` at frontmatter (demoted fallback), regenerate the committed `index.json` via `skillrig index`, and add the convention-1 catalog-schema note to `docs/CONVENTION.md`. +4. **Correct stale docs (FR-024):** architecture §9 line 307 says "On release, `internal/index` walks `skills/*/skill.toml`" — update to "on **merge to main**, `skillrig index` walks `skills/*/SKILL.md` frontmatter." Note explicitly: catalog is single-tip; version history is git tags, reached via `--pin`. +5. **State the single-tip model in the `search` UX:** `search` shows the current offered version; document that `add --pin ` is how to get a specific/older version (it fetches the tag, not the catalog). This keeps users from expecting `search` to be a version browser. +6. **Record GC + cross-ref aggregation as explicit non-goals** in spec.md/spec-tech.md so a later reviewer doesn't reopen them; the trigger to revisit is "a feature needs a version-history/browse-all-versions catalog," which v0 does not. + +## The concrete FR-023 catalog contract (convention 1) + +`index.json` at origin repo root. **Single-tip, full-regenerate, no per-skill treeSha/commit (discovery-only).** Produced by `skillrig index` (authoritative) / `build-index.sh` (demoted fallback) on merge to the default branch. 
+ +```jsonc +{ + "skillrigConvention": 1, // ← .skillrig-origin.toml convention_version (binary gates on this) + "origin": "my-org/my-skills", // ← .skillrig-origin.toml origin (OWNER/REPO identity, §9b) + "skills": [ + { + "name": "terraform-plan-review", // ← SKILL.md frontmatter `name` (== dir name) + "version": "1.4.0", // ← metadata.x-skillrig.version (== latest released tag at HEAD) + "namespace": "my-org", // ← metadata.x-skillrig.namespace (optional for search) + "description": "Review a terraform plan ...", // ← SKILL.md frontmatter `description` + "tags": ["platform-team","terraform","aws"], // ← metadata.x-skillrig.tags (space-string → split); REQUIRED for --tag + "path": "skills/terraform-plan-review", // ← skill directory (relative to repo root) + "requires": [ // ← metadata.x-skillrig.requires (carried; not required by search) + { "tool": "oxid", "version": ">=0.4.0", "source": "my-org/my-skills" }, + { "tool": "terraform", "version": ">=1.6", "source": "hashicorp/terraform" } + ] + } + ] +} +``` + +**Contract invariants:** +- The catalog is a **pure function of the HEAD tree's `SKILL.md` frontmatter** + `.skillrig-origin.toml`. Reproducible: `skillrig index` over the same tree yields byte-identical (modulo key order) output ⇒ the contract test compares structurally. +- **`search` consumes:** per-skill `name`, `version`, `description`, `tags[]`, `path`; root `skillrigConvention`, `origin`. `namespace`/`requires` are carried but optional for `search` this slice (spec-tech.md §2 line 33). +- **One version per skill** = the HEAD/current version. No version arrays, no historical rows (D-S2-tip). +- **`skillrigConvention` is gated by the binary** before `search`/`add` act (§4 convention gate, FR-016); a mismatch is the "update skillrig" error class. +- **No `treeSha`/`commit` per skill** — the catalog is discovery-only; integrity anchors are computed at `add` time and recorded in the consumer's lock (spec-tech.md §2 line 28, §5). 
+ +## References + +- Origin index workflow (trigger=push:main, paths skills/**; full-regenerate + commit-if-changed; calls `skillrig index` else fallback) — `/Users/vincentdesmet/specledger/skillrig/skillrig-origin/.github/workflows/index.yml`. +- Origin fallback generator (walks `skills/*/skill.toml`, emits reduced `name/version/description/path` — the FR-023 drift) — `/Users/vincentdesmet/specledger/skillrig/skillrig-origin/scripts/build-index.sh:18,21,25`. +- Committed catalog (single-tip shape, `generated` banner, full schema incl. tags/requires) — `/Users/vincentdesmet/specledger/skillrig/skillrig-origin/index.json`. +- Origin convention/contract (convention_version=1, origin, skills_dir, `tag_scheme = "name-vSEMVER"`) — `/Users/vincentdesmet/specledger/skillrig/skillrig-origin/.skillrig-origin.toml`. +- release-please config (per-skill prefixed tags: `include-component-in-tag`, `tag-separator '-'` → `terraform-plan-review-v1.4.0`) — `/Users/vincentdesmet/specledger/skillrig/skillrig-origin/release-please-config.json`; manifest `/Users/vincentdesmet/specledger/skillrig/skillrig-origin/.release-please-manifest.json`. +- Release workflow (release-please cuts prefixed tags on push:main; goreleaser only for `oxid-` CLI tags) — `/Users/vincentdesmet/specledger/skillrig/skillrig-origin/.github/workflows/release.yml`. +- Sample skill manifest (version 1.4.0 in `skill.toml`; frontmatter already has name/description — the S1 drift) — `/Users/vincentdesmet/specledger/skillrig/skillrig-origin/skills/terraform-plan-review/{skill.toml,SKILL.md}`. +- Architecture — `docs/ARCHITECTURE-v0.md` §2 line 88 (index = merge-triggered, only write, runs `skillrig index`), §9 lines 305-309 ("on release" wording stale; walks skills, discovery-only, GH-Pages dropped), §9b line 315 (`OWNER/REPO[/path]@ref` identity grammar, catalog as one of three consumers), §13 roadmap (catalog in v0). 
+- Upstream S1 (field-source: frontmatter + `metadata.x-skillrig.*`; drop `skill.toml`; AP-04 single parser) — `specledger/003-search-remote/research/2026-05-31-skill-manifest-format.md`. +- Framing — `specledger/003-search-remote/spec-tech.md` §2 (catalog discovery-only, build-index.sh drift = FR-023), §8b (Spike S2), §9 (FR-023/FR-024 co-evolution). diff --git a/specledger/003-search-remote/research/2026-05-31-remote-git-testing.md b/specledger/003-search-remote/research/2026-05-31-remote-git-testing.md new file mode 100644 index 0000000..7bb57a8 --- /dev/null +++ b/specledger/003-search-remote/research/2026-05-31-remote-git-testing.md @@ -0,0 +1,244 @@ +# Research: Remote Git Testing + +**Date**: 2026-05-31 +**Context**: 003-search-remote introduces real remote git fetch (catalog and skill subtree) for `search` and `add`. 002 had zero network boundary — all test substrate was local git working trees. This spike determines the test architecture for the new network-failure FRs (FR-017 auth, FR-018 unreachable, transient/timeout) and the happy-path fetch path. +**Time-box**: ~45 min (code archaeology + analysis) + +--- + +## Question + +What test substrate exercises the new network-failure FRs (FR-017 auth failure, FR-018 unreachable, and transient/timeout errors) for remote `add`/`search`, and what substrate covers the happy fetch path with a ground-truth tree-SHA assertion? + +--- + +## Findings + +### Finding 1: How 002 bootstraps the origin and stubs git (ground truth from code) + +**Integration tests** (`test/skillcore_quickstart_test.go`, `test/quickstart_test.go`): + +002 bootstraps the origin by: +1. Copying the committed fixture at `test/testdata/sample-origin` into a fresh `t.TempDir()`. +2. Running `git init -q -b main` + `git add -A` + `git commit` in that tmpDir using a **pinned identity** (fixed `GIT_AUTHOR_NAME/EMAIL/DATE` for reproducible commit SHAs — `pinnedGitEnv()` in both the integration and unit helpers). +3. 
Nesting the origin at `/my-org/my-skills` (the `OWNER/REPO` path the resolver maps to a local directory via `originDirRef`). +4. The CLI binary is built once in `TestMain` and exec'd via `run()`. There is **no `file://` URL, no `git clone`, no network layer** — `skillcore.Add` receives `OriginDir` as a plain filesystem path and runs `git -C ` directly. + +The independent oracle (`rawTreeSHA`) reads the tree-SHA with `git rev-parse HEAD:` directly on the origin dir — never through skillcore — to prevent circular validation (Constitution III / D11). + +**Unit tests** (`pkg/skillcore/helpers_test.go`, `treesha_test.go`): + +`bootstrapOrigin(t)` does the same pattern in-process: `git init` + write fixture files + commit in a `t.TempDir()`. There is no mock of git itself for the happy path — the real `git` binary runs against a real (ephemeral) repo. + +The **stub seam** (`stubCommandContext` + `TestHelperProcess`) is how error paths are exercised without real git: +- `gitClient.commandContext` is a pluggable field (`func(ctx, name, args) *exec.Cmd`). +- `stubCommandContext(exitCode, stderr)` returns a `commandContext` that re-execs the test binary with `GO_WANT_HELPER_PROCESS=1`, causing `TestHelperProcess` to write the given stderr and call `os.Exit(exitCode)`. +- This produces a real `*exec.ExitError` (not a mock interface), so `gitClient.run`'s error-wrapping path is exercised exactly as in production. +- Used in `TestGitClient_StubbedExit` to assert `*GitError{ExitCode, Stderr}` is populated correctly for exit 1, 128, etc. + +**Key fact**: the stub seam lives entirely in `pkg/skillcore` and targets the `gitClient.commandContext` field. It does **not** stub network or HTTP — it stubs the subprocess exit. + +--- + +### Finding 2: Can `file://` (or a local bare repo) cover the happy/integrity fetch path? 
+ +**Yes, with confidence.** Git's `file://` transport (and the `file://` URL variant of `git clone`/`git fetch`) uses the same plumbing as HTTPS but against a local bare repo. It exercises: +- The real `git clone --filter=blob:none --sparse` (partial clone) or `git fetch` code path. +- Sparse-checkout expansion of the skill subtree. +- The same `git rev-parse :` tree-SHA computation that `skillcore.TreeSHA` calls — the ground-truth assertion `fetched tree-SHA == raw git tree-SHA of origin subtree` holds because both call the same git plumbing on the same object graph. + +A **local bare repo** (`git init --bare`) is the cleanest substrate: +- No working tree to accidentally mutate. +- `git clone file:///path/to/bare.git` runs a real fetch handshake (not a cp/hardlink shortcut like a plain path clone without `file://`). +- Fully offline and deterministic. Push to it from the fixture working tree, then clone from it in the test. + +Bootstrap pattern for 003 integration tests: +```go +// 1. Create a fixture working tree (same as 002's bootstrapOrigin) +// and commit the sample skill + index.json. +// 2. Create a bare repo alongside it: +// git init --bare /origin.git +// 3. Push the fixture into the bare: +// git -C push file:///origin.git HEAD:main +// 4. Supply the origin URL to the CLI/skillcore as "file:///origin.git" +// (or the OWNER/REPO config pointing at a path-shaped origin). +// 5. rawTreeSHA: git -C rev-parse HEAD: +// (still never through skillcore — the same D11 independence). +``` + +The `file://` URL exercises exactly the fetch path that HTTPS will use in production — only the transport layer changes. This is how projects like `go-git`, `gh`, and `git` itself test their fetch logic offline. + +**Confidence: high** — this is well-established in the Go ecosystem (see `go-git` test suite, `gh` CLI integration tests). + +--- + +### Finding 3: Which FRs cannot be simulated with `file://`? 
+ +| FR | Failure class | `file://` / bare repo | Alternative needed | +|---|---|---|---| +| FR-017 | Auth failure (401/403, private origin, bad token) | Cannot simulate: `file://` has no auth layer | Fault injection at the exec boundary | +| FR-018 | Unreachable (network down, DNS failure, timeout) | Cannot simulate: `file://` always succeeds locally | Fault injection at the exec boundary | +| Transient | Timeout / partial failure mid-fetch | Cannot simulate reliably with `file://` | Fault injection at the exec boundary | +| FR-016 | Convention version mismatch | Fully simulable: write wrong `convention_version` in the bare origin's index.json | No HTTP needed | +| FR-015 | Skill not found / pin not found | Fully simulable with `file://` | No HTTP needed | +| FR-019 | Skill not in lock (verify after remote add) | Fully simulable | No HTTP needed | + +**Auth (FR-017)**: `file://` has no authentication layer. An auth failure from a real HTTPS origin surfaces as `git clone` exiting with code 128 and stderr like `fatal: Authentication failed for 'https://github.com/...'` or `fatal: could not read Username`. These cannot be reproduced offline without either (a) a real HTTPS server that returns 401, or (b) injecting that exit code + stderr at the exec boundary. + +**Unreachable (FR-018)**: A network-down/DNS failure surfaces as git exiting 128 with `fatal: unable to access 'https://...': Could not resolve host:` or `Operation timed out`. `file://` will never produce these. Same two options: real HTTP server or exec-boundary injection. + +**Transient / timeout**: Similar — these are OS-level TCP failures, not reproducible with `file://`. + +--- + +### Finding 4: Recommended test seam — exec boundary fault injection, not httptest + +**The recommended seam is the same `gitClient.commandContext` field already in `pkg/skillcore/git.go`**, extended to cover the new remote fetch operations (e.g., `git clone --filter=blob:none --sparse`, `git fetch`, `git ls-remote`). 
+ +**Rationale against `httptest` with a real git-over-HTTP handshake**: + +1. **Complexity / fragility**: A real git-over-HTTP server requires implementing the git smart-HTTP protocol (the `info/refs` + `upload-pack` handshake). This is non-trivial — `git http-backend` is a CGI binary, not a Go library. Wrapping it in `httptest` means shelling `git http-backend` from a Go test, which: + - Requires `git` to be in PATH (already a requirement, but adds the CGI dep). + - Is OS-specific in behavior (especially on macOS where `git http-backend` may not exist separately from the git bundle). + - Produces opaque failures when the handshake version or capability negotiation changes across git releases. + +2. **Exec-boundary injection proves the right thing**: What FR-017/018 tests need to assert is: "when the git subprocess fails with exit 128 and an auth-failure stderr, skillrig surfaces a distinct `AuthError` (not `UnreachableError`, not `SkillNotFoundError`) with the right what/why/fix message." That contract is fully testable by injecting the right `(exitCode, stderr)` pair via `stubCommandContext`. The CLI's error-mapping code is the thing under test, not git's HTTP stack. + +3. **`file://` is the right boundary for happy-path integration**: The integration suite (`TestQuickstart_*`) should use a local bare repo over `file://` for the full stack — binary exec'd, real `git clone`, real tree-SHA. The ground-truth assertion (`fetched treeSHA == rawTreeSHA of the origin fixture`) closes the loop across the fetch boundary. + +4. **Nothing genuinely requires a live HTTP layer in the test gate**: The only thing not coverable is "does git actually fail with exit 128 on a real 401 response?" — but that is a git behavior test, not a skillrig behavior test. Skillrig's contract is to correctly classify and render the error type; the classification is driven by the exit code and stderr pattern, both of which the stub can supply exactly. 
+ +**Proposed seam architecture for 003**: + +``` +pkg/skillcore/ + git.go — extend gitClient with remote ops: + Clone(ctx, remote, dest, opts) (string, error) + LsRemote(ctx, remote, ref) (string, error) // resolve commit SHA for a ref/pin + FetchSparse(ctx, remote, ref, subtree, dest) error + All dispatch through gitClient.commandContext — the single injectable point. + errors.go — add typed errors: + AuthError{Remote, Stderr string} // exit 128 + auth-failure stderr pattern + UnreachableError{Remote, Stderr string} // exit 128 + DNS/timeout stderr pattern + ConventionError{Got, Supported int} // FR-016 + fetch.go — the new remote acquisition layer: + Fetch(ctx, opts FetchOptions) (FetchResult, error) + FetchOptions carries the remote URL, ref, subtree path, dest dir. + Returns the fetched commit SHA + the tree-SHA (computed after fetch). + Called by Add when the origin is remote (not a local path). +``` + +**Unit test pattern (fault injection)**: +```go +func TestFetch_AuthFailure(t *testing.T) { + c := &gitClient{commandContext: stubCommandContext(128, + "fatal: Authentication failed for 'https://github.com/private/repo.git'")} + _, err := c.Clone(context.Background(), "https://github.com/private/repo.git", t.TempDir(), CloneOptions{}) + var authErr *AuthError + if !errors.As(err, &authErr) { + t.Fatalf("want *AuthError, got %T: %v", err, err) + } +} + +func TestFetch_Unreachable(t *testing.T) { + c := &gitClient{commandContext: stubCommandContext(128, + "fatal: unable to access 'https://github.com/foo/bar.git/': Could not resolve host: github.com")} + _, err := c.Clone(context.Background(), "https://github.com/foo/bar.git", t.TempDir(), CloneOptions{}) + var unreachErr *UnreachableError + if !errors.As(err, &unreachErr) { + t.Fatalf("want *UnreachableError, got %T: %v", err, err) + } +} +``` + +**Integration test pattern (`file://` bare repo)**: +```go +func newRemoteConsumerRepo(t *testing.T) (consumerRepo, string /* originURL */) { + // 1. 
Fixture working tree (as in bootstrapOrigin) + fixtureDir := bootstrapFixtureOrigin(t) // same as 002 + // 2. Bare repo + bareDir := filepath.Join(t.TempDir(), "origin.git") + runGit(t, fixtureDir, "init", "--bare", bareDir) // or: git init --bare bareDir + runGit(t, fixtureDir, "push", "file://"+bareDir, "HEAD:main") + originURL := "file://" + bareDir + // 3. Consumer repo with SKILLRIG_ORIGIN pointing at the bare repo URL + root := t.TempDir() + runGit(t, root, "init", "-q", "-b", "main") + // run skillrig init --origin ... + return consumerRepo{root: root, originDir: fixtureDir}, originURL +} + +func TestQuickstart_RemoteAdd(t *testing.T) { + c, originURL := newRemoteConsumerRepo(t) + wantTreeSHA := rawTreeSHA(t, c.originDir, "HEAD", "skills/"+sampleSkill) + res := run(t, runOpts{ + args: []string{"add", sampleSkill}, + cwd: c.root, + env: map[string]string{"SKILLRIG_ORIGIN": originURL}, + }) + // ... assert exit 0, treeSHA in lock == wantTreeSHA +} +``` + +**Error-class integration tests** (these use the stub path, not `file://`): +```go +// Auth failure: inject the stub at the skillcore boundary, +// call the CLI's add command directly (in-process, not exec-of-binary), +// or provide a fake binary wrapper that exits 128 with the right stderr. +``` + +For the integration (exec-of-binary) tier, the auth/unreachable tests can use a small **shim binary** (compiled once in `TestMain` alongside the real binary) that simply exits with a given code and writes a given stderr — or use `SKILLRIG_GIT_BIN` env to point the CLI at a fake git binary. However, the simpler path is to test auth/unreachable error classification as **unit tests** against the `pkg/skillcore` fetch layer directly (in-process, no binary exec), and test the CLI rendering of those errors via `internal/cli` unit tests (which call `addCmd.run` directly with a stub `gitClient`). + +--- + +### Finding 5: What genuinely cannot be covered offline? + +1. 
**GitHub's actual auth handshake**: Verifying that `gh auth token` or `GIT_ASKPASS` is correctly plumbed to git for a real private repo. This requires a real GitHub call and belongs in a manual/CI E2E test tier, not the offline gate. + +2. **Partial-fetch abort mid-stream**: A real TCP reset mid-clone cannot be deterministically reproduced with `file://` or exec stubs. However, git's behavior on abort (non-zero exit, stderr describing the failure) is stable, and the exit-code/stderr injection covers skillrig's handling of it. + +3. **Rate limiting (HTTP 429)**: Would require a real HTTP server. Not worth building for the offline gate; document as a future E2E concern. + +None of these block the correctness of the offline suite. The offline suite can provide full coverage of skillrig's own error classification, rendering, and exit-code logic. + +--- + +## Decisions + +- **Decision 1: `file://` + local bare repo for the integration happy path.** The `TestQuickstart_*` suite bootstraps a bare repo in a `t.TempDir()`, pushes the fixture into it, and points the CLI at it via `file://` URL. This runs the real fetch path (sparse-checkout, commit resolution, tree-SHA computation) entirely offline and deterministically. The ground-truth assertion (`fetched treeSha == rawTreeSHA of fixture origin`) closes the integrity loop across the fetch boundary. + +- **Decision 2: Exec-boundary fault injection (extended `gitClient.commandContext` stub) for FR-017/018 unit tests.** Auth, unreachable, and transient error paths are tested as `pkg/skillcore` unit tests using the existing `stubCommandContext` pattern with crafted `(exitCode=128, stderr=)` pairs. This is deterministic, offline, and proves skillrig's error-type discrimination and rendering without needing an HTTP server. 
+ +- **Decision 3: No `httptest` with a real git-over-HTTP handshake in the offline gate.** The complexity and fragility (CGI binary, OS-specific `git http-backend`, protocol negotiation across git versions) is not justified. The seam at the exec boundary is sufficient for all FRs in the spec. + +- **Decision 4: Typed errors for each failure class in `pkg/skillcore/errors.go`.** `AuthError`, `UnreachableError`, and `ConventionError` (FR-016) must be distinct types (not string matching on `GitError.Stderr` in the CLI). The CLI maps each to a `*UsageError` with the right what/why/fix. The stderr pattern matching (auth vs unreachable) lives inside `pkg/skillcore`'s fetch layer, not in `internal/cli`. + +- **Decision 5: Stderr-pattern discrimination lives in `skillcore`, not `cli`.** `fetch.go` inspects `GitError.Stderr` to classify the failure before returning to the CLI. This keeps the CLI presentation-free (it receives a typed error, not a raw git stderr). The pattern strings (e.g., `"Authentication failed"`, `"Could not resolve host"`) are internal to skillcore and tested by the unit suite. + +--- + +## Recommendations + +1. **Extend `pkg/skillcore/git.go`** with a `Clone`/`FetchSparse` method on `gitClient`, dispatching through the existing `commandContext` field. No new seam needed — the seam already exists. + +2. **Add `AuthError` and `UnreachableError` to `pkg/skillcore/errors.go`**, classified from `GitError.Stderr` inside the fetch layer. Write unit tests using `stubCommandContext` with real-world git stderr strings for each class (copy from actual git output). + +3. **Bootstrap a `newRemoteConsumerRepo(t)` helper in `test/`** that creates a bare repo, pushes the fixture, and returns the `file://` URL. All `TestQuickstart_RemoteAdd*` scenarios use it. The `rawTreeSHA` oracle pattern is unchanged. + +4. **Do not add `httptest`** to the test dep graph for this slice. 
Note it as a possible future tier for E2E auth smoke tests against a real GitHub App installation. + +5. **Catalog fetch** (for `search`): the same `file://` bare repo can host an `index.json` in its root, fetched via a single `git show :index.json` or sparse-checkout. The auth/unreachable error paths for catalog fetch use the same stub pattern as skill fetch. + +6. **`--pin` / ref resolution** (US3): `git ls-remote file:///path/to/origin.git ` returns the commit SHA for a tag — this too runs through `gitClient.commandContext` and is stubbable for "no such ref" errors. + +--- + +## References + +- `pkg/skillcore/git.go` — the `gitClient` struct and `commandContext` seam (lines 1–99). +- `pkg/skillcore/helpers_test.go` — `stubCommandContext`, `TestHelperProcess`, `bootstrapOrigin` (the full stub + fixture pattern). +- `pkg/skillcore/treesha_test.go` — `TestGitClient_StubbedExit`, `TestTreeSHA_GroundTruth`, `TestTreeSHA_RelocationInvariance` (the existing stub-seam usage + ground-truth discipline). +- `test/skillcore_quickstart_test.go` — `newConsumerRepo`, `bootstrapOrigin`, `rawTreeSHA`, `pinnedGitEnv` (the integration bootstrap pattern, lines 139–228). +- `specledger/003-search-remote/spec-tech.md` §6 (failure taxonomy), §7 (test tier decision), §8b S4 (this spike's brief). +- `go-git` test suite (upstream reference for `file://` bare-repo integration patterns): `https://github.com/go-git/go-git/tree/main/plumbing/transport/file` +- `gh` CLI git client pattern: `https://github.com/cli/cli/blob/trunk/git/client.go` (the `commandContext`-injectable field this codebase mirrors, per research D7 in helpers_test.go). 
diff --git a/specledger/003-search-remote/research/2026-05-31-search-index-architecture.md b/specledger/003-search-remote/research/2026-05-31-search-index-architecture.md new file mode 100644 index 0000000..261f897 --- /dev/null +++ b/specledger/003-search-remote/research/2026-05-31-search-index-architecture.md @@ -0,0 +1,154 @@ +# Research: Search Index Architecture — query semantics, index storage, deps, and topic-vs-tag + +**Date**: 2026-05-31 +**Context**: Spike S5 for `003-search-remote`. The user is revising the `search` design: `skillrig search [QUERY]` should treat the positional arg as a **query string matched against a skill's name + description** (from frontmatter), not only label filtering; and "tag" terminology collides with **git tags** (skillrig pins versions via git tags, scheme `name-vSEMVER`), so the label concept should likely be renamed **topic**. New deps for indexing/searching are PRE-APPROVED. This spike fixes: (1) query matching semantics over name+description(+topics); (2) whether flat `index.json` + in-memory filter is sufficient (YAGNI / crossover scale); (3) git-friendly index storage; (4) the Go library call (named lib or stdlib-only); (5) the topic↔tag terminology call; (6) where it runs + whether the catalog needs a new field. +**Time-box**: ~45 min +**Confidence**: HIGH on (1)(2)(3)(5)(6) — grounded in the resolved S1/S2 model, the real origin catalog, and N6/architecture constraints. HIGH on (4) — verified each candidate lib's source for cgo/binary-index/determinism. + +## Question + +What is the right v0 search algorithm and index architecture for `skillrig search [QUERY] [--topic T ...]`, given the binding constraints: **N6 deterministic + offline + no LLM/vector/embedding/semantic ranking**; **single static no-cgo binary**; **git-friendly (text, not binary) committed index**; **consume-only, no service**; **single `skillcore` / YAGNI**? And: rename label→topic? Does the catalog need a new field? 
+ +## Findings + +### Finding 1: Real scale is tiny — a PRIVATE org origin is tens to low-hundreds of skills + +The catalog is **single-tip** (S2 / D-S2-tip): exactly **one entry per skill that exists at the origin's HEAD**, never a version-history index. So the catalog row count == number of *distinct skills an org currently publishes*, not skills × versions. + +Concrete anchors on scale: +- The real PoC origin (`skillrig-origin/index.json`) today has **1** skill. A mature private org library is realistically **tens, plausibly low-hundreds**, of skills — a private platform team's curated set, not a public marketplace. +- Each entry is small: `name` (≤64 chars), `description` (≤1024 chars), `version`, `namespace`, `tags`/topics (a handful of short strings), `path`, optional `requires`. Call it ~0.5–2 KB JSON/entry. **200 skills ≈ 100–400 KB of JSON.** +- For comparison, agentskills.io's *public* discovery is GitHub Code Search over `filename:SKILL.md` (gh's `skill search`, S1 Finding 4) — i.e. even the public ecosystem has **no catalog of this kind**; skillrig's catalog is a deliberately small, org-controlled artifact. + +**Implication:** at this scale, *loading the whole catalog into memory and linearly scanning every entry per query is trivially fast* — sub-millisecond for hundreds of entries, and the dominant cost of `search` is the network fetch of `index.json` (D-catalog-fetch: fetched per call), not the matching. There is **no data-structure problem to solve** at v0 scale. + +### Finding 2: The crossover where linear scan stops being adequate is ~10,000–100,000+ entries — orders of magnitude beyond reach + +A linear scan doing case-fold substring/token matching over name+description is O(N × |description|). At N=200 with ~1 KB descriptions that is ~200 KB of byte scanning per query — microseconds. Even a pathological N=5,000 is ~5 MB scanned — still single-digit milliseconds in Go, and still dwarfed by the git fetch. 
+ +An **inverted index / trie / specialized search engine** only earns its keep when *either* (a) N is large enough that per-query linear scan is user-perceptible (rule of thumb: tens of thousands to millions of docs), *or* (b) the corpus is fetched once and queried repeatedly in a hot loop (a server). **Neither holds here:** N is ≤ low-hundreds, and `search` is a one-shot CLI invocation that fetches the catalog fresh each call (no persistent process, no repeated queries against a warm index). Building any index structure would be **pure YAGNI** — it would be slower end-to-end (build cost > scan saved) and add a dependency for zero benefit. + +**Crossover verdict:** load-JSON-and-filter is YAGNI-correct until the catalog reaches **~10k+ skills queried in a long-lived process** — a scale skillrig's single-tip private-org catalog will not approach. Record this so a future reviewer doesn't prematurely add an index. + +### Finding 3: Index storage — keep the flat `index.json` (option a). Reject a committed inverted index (b); the "ephemeral index at search time" (c) collapses into option a anyway + +The three options from the brief: + +- **(a) Flat `index.json` regenerated by `skillrig index` (S2) + in-memory filter at search time.** This is the *existing, resolved* model. `index.json` is a small, diff-friendly, human-reviewable JSON array — exactly the "git-friendly text artifact" the constraints demand. It is regenerated wholesale (full-regenerate, S2) so it never accumulates churn, and a PR reviewer can read a catalog diff. 
**This is the answer.** + +- **(b) A committed *prebuilt inverted index* (JSON/JSONL).** Even kept as text (so technically not a "binary blob"), this is wrong for v0: (i) it is a derived artifact that *duplicates* `index.json`'s content into a less-reviewable token→postings shape, doubling the committed surface and the drift risk (now TWO generated artifacts must agree); (ii) at ≤ low-hundreds of skills it indexes nothing worth indexing (Finding 2); (iii) it adds a generator + a consumer for a structure that saves microseconds. It violates YAGNI and AP-04-adjacent single-source discipline (two catalogs of truth). **Reject for v0.** + +- **(c) Generate an *ephemeral* index at search time from the fetched catalog.** This is just "load `index.json` into memory and (maybe) build a transient map before filtering." For ≤ hundreds of entries you don't even need the transient map — a direct slice scan is simpler and as fast. So (c) **degenerates into (a)** with at most a trivial in-memory pass; there is no separate artifact and nothing committed. Adopt the *plain* form: load `index.json` → filter the slice. No index structure, ephemeral or committed. + +**Storage decision:** keep the single committed `index.json` (option a), filtered in memory at search time. No second index artifact, committed or ephemeral. This is consistent with S2 (single-tip, full-regenerate, one catalog) and the git-friendliness constraint (one small reviewable JSON). + +### Finding 4: Query semantics — case-insensitive token-substring AND-match over name+description(+topics), with a deterministic score-then-name ordering + +This is the substantive design call. Requirements that bound it: +- **N6**: deterministic + offline; **no** LLM/vector/embedding/semantic ranking. If ranked, ranking MUST be a deterministic pure function of (query, catalog) with a stable tiebreak. +- **FR-002**: topic filtering is deterministic and repeatable, "**no inference or ranking by relevance**." 
+- **FR-002/A3 (original)**: multiple topics narrow by AND (a skill must carry all requested topics). +- The user's revision: `[QUERY]` matches against **name + description** (and reasonably topics too), not only labels. + +There is a real tension: FR-002 says topic *filtering* has "no ranking by relevance." That constraint is about the **`--topic` filter** (a pure set-membership predicate), and it should stay. The new free-text `[QUERY]` is a different axis — it *may* be ranked, but **only by a deterministic score**, never by inference. Recommended semantics, kept deliberately simple and fully deterministic: + +**Matching (the predicate — which skills appear):** +1. **Normalize**: lowercase (Unicode case-fold) both the query and the searched fields. Trim whitespace. +2. **Tokenize the query** on whitespace into terms `q1..qk`. +3. A skill **matches** iff **every** query term is a **substring** of the skill's *searchable text* = `name + "\n" + description + "\n" + join(topics," ")`. (Token-AND over a concatenated haystack — each term must appear *somewhere* in name OR description OR topics.) Substring (not whole-word) so `terra` matches `terraform`; AND across terms so `terraform aws` narrows. +4. **`--topic T` filters are a separate AND predicate** applied *on top*: exact-string, case-insensitive membership in the skill's `topics[]`; repeating `--topic` ANDs them (a skill must carry **all** requested topics) — preserving FR-002/A3 semantics unchanged, just renamed topic. +5. **Empty `[QUERY]` and no `--topic`** ⇒ everything matches (list all — Acceptance Scenario US1.1). **No match** ⇒ empty result, exit 0 (FR-004). 
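The matching steps above can be sketched with the standard library alone. This is a minimal illustration: the `Entry` type and the `matches` function are hypothetical names, and `strings.ToLower` is used as a stand-in for case folding (it is simple lowercasing, not full Unicode case folding, which is an assumption of this sketch).

```go
package main

import (
	"fmt"
	"strings"
)

// Entry is a hypothetical, reduced catalog row: only the fields search reads.
type Entry struct {
	Name        string
	Description string
	Topics      []string
}

// matches applies the two AND predicates:
//  1. every whitespace-separated query term is a substring of
//     name + "\n" + description + "\n" + join(topics, " "), case-insensitively;
//  2. every requested topic is an exact, case-insensitive member of Topics.
//
// An empty query and no requested topics match every entry.
func matches(e Entry, query string, wantTopics []string) bool {
	haystack := strings.ToLower(e.Name + "\n" + e.Description + "\n" + strings.Join(e.Topics, " "))

	// strings.Fields splits on runs of whitespace and drops empty terms,
	// so surrounding whitespace in the query is trimmed for free.
	for _, term := range strings.Fields(strings.ToLower(query)) {
		if !strings.Contains(haystack, term) {
			return false
		}
	}

	for _, want := range wantTopics {
		found := false
		for _, have := range e.Topics {
			if strings.EqualFold(have, want) {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

func main() {
	e := Entry{
		Name:        "terraform-plan-review",
		Description: "Review a terraform plan",
		Topics:      []string{"platform-team", "terraform", "aws"},
	}
	fmt.Println(matches(e, "terra aws", nil))                 // true: both terms are substrings
	fmt.Println(matches(e, "terraform gcp", nil))             // false: "gcp" appears nowhere
	fmt.Println(matches(e, "", []string{"AWS", "terraform"})) // true: topic match ignores case
	fmt.Println(matches(e, "", []string{"azure"}))            // false: topic not carried
}
```

Keeping the free-text predicate and the topic predicate as two separate loops mirrors the split in the text: the topic filter stays a pure set-membership check regardless of how the query terms are handled.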
+ +**Ordering (deterministic, N6-safe):** results are sorted by a pure function of (query, entry): +- **Primary — a small integer relevance score**, highest first, computed *only* from where the query terms hit (no inference, no corpus statistics needed): + - exact `name == query` → score 3 (the user typed the skill's exact name); + - any term hits `name` (substring) → score 2; + - any term hits `topics` → score 1; + - terms hit only `description` → score 0. + (Take the max applicable bucket. This is a fixed lookup, not learned weights — deterministic and explainable in one sentence.) +- **Tiebreak — lexicographic by `name`** (stable, total order; names are unique per S2). This guarantees byte-identical ordering across runs (SC-002, determinism) regardless of input catalog order. +- With an **empty query** the score is uniform, so ordering is purely lexicographic by name — a stable, scannable listing. + +This is **deterministic by construction** (no randomness, no floats, no embeddings, no global IDF/TF that could shift as the catalog grows — score depends only on this entry + this query), satisfies N6, keeps `--topic` a pure filter (FR-002), and gives the free-text query a *useful* relevance order (exact/name matches first) without crossing into "inference." + +**Why not fuzzy (Levenshtein/Smith-Waterman/bitap)?** Deterministic *fuzzy* matching is technically N6-compatible (it's a pure function), and libs exist (Finding 5). But for v0 it is **over-engineering and a UX hazard**: fuzzy ranking over ≤ hundreds of short org skill names invites surprising/typo'd matches an agent then feeds to `add` (which then fails "not found"). Substring token-AND is predictable, explainable in the `--help`, and an exact `name==query` short-circuit already gives the "I know the name" path. Reserve fuzzy as a *post-v0* enhancement if users report misses; it is not needed to meet US1. + +### Finding 5: Go libraries — recommend **stdlib only** (`strings` + `slices`/`sort`). 
Every external candidate is unjustified at this scale + +New deps are pre-approved, so this is judged on merit against the constraints — and on merit the matching above is ~30 lines of stdlib. Candidates evaluated (import path · cgo/binary · determinism · verdict): + +| Library | Import path | cgo / binary index | Deterministic? | Verdict for v0 | +|---|---|---|---|---| +| **stdlib** | `strings`, `slices`, `sort` | none / none | yes (you control the sort) | **ADOPT.** `strings.ToLower`, `strings.Contains`, `strings.Fields`, `slices.SortFunc`. Zero new deps, fully deterministic, trivially testable. | +| **blevesearch/bleve** | `github.com/blevesearch/bleve/v2` | **persists a binary on-disk index**; large transitive dep tree (scorch, RoaringBitmaps, protobuf, boltdb). Pure-Go in-memory is possible but the whole point (a persisted inverted index) is git-hostile and unnecessary. | configurable but heavyweight | **REJECT.** Binary index = git-hostile (constraint), heavyweight for a one-shot CLI over ≤ hundreds of docs. Massive over-build. | +| **sahilm/fuzzy** | `github.com/sahilm/fuzzy` (v0.1.2) | **pure Go, zero deps, no cgo**; ranks descending by deterministic Sublime-style quality (first-char/camel/separator/adjacency) | yes, stable descending order | **DEFER (not v0).** The cleanest fuzzy option *if* fuzzy is ever wanted; deterministic and dependency-free. But fuzzy isn't needed for US1 (Finding 4), so adopting it now is YAGNI + the UX hazard above. Keep as the named post-v0 upgrade. | +| **lithammer/fuzzysearch** | `github.com/lithammer/fuzzysearch/fuzzy` | pure Go, no cgo | `RankFind` returns scores you `sort` — stable only if you add a tiebreak | DEFER/REJECT. Subsequence match (not substring) is *looser* than wanted (would match scattered letters); Levenshtein ranking needs a manual stable tiebreak anyway. No advantage over stdlib for v0. 
| +| **schollz/closestmatch** | `github.com/schollz/closestmatch` | pure Go, no cgo, n-gram bag-of-words | deterministic | REJECT. Tuned for *long/multi-word documents* (book titles); its own docs say it does **worse than Levenshtein on single-word/dictionary** inputs — i.e. exactly the short-name case skillrig has. Wrong tool. | + +**Dependency recommendation: STDLIB ONLY** (`strings`/`slices`/`sort`). No new dependency is justified — the deterministic token-AND-substring matcher with a fixed-bucket score + name tiebreak is a few dozen lines, has zero supply-chain/maintenance cost, and is the most testable (assert exact ordering). This *also* honors the project's minimal-deps posture without needing the pre-approval. **If** a future iteration wants forgiving/typo-tolerant matching, the named upgrade is **`github.com/sahilm/fuzzy`** (pure-Go, zero-dep, no-cgo, deterministic stable ordering) — record it as the sanctioned escape hatch, but do not pull it in for v0. (The note "these will be acceptable new dependencies" most plausibly referred to a fuzzy lib like `sahilm/fuzzy` and the already-accepted `gopkg.in/yaml.v3` from S1; neither is needed for *search matching* specifically.) + +### Finding 6: Terminology — rename label → **topic** (`--topic`, `metadata.x-skillrig.topics`). No ecosystem convention forces "tags" + +Verified against the agentskills.io specification (full frontmatter field table, re-fetched): the standard defines **`name`, `description`, `license`, `compatibility`, `metadata`, `allowed-tools`** — and **NO `tags`, `topics`, `keywords`, or `categories` field**. Categorization is *entirely* skillrig's own extension under `metadata.x-skillrig.*`. So: + +- There is **no upstream "tags" convention to honor** — the ecosystem standard is silent on labels; skillrig owns this namespace freely. +- The only nearby ecosystem signal is **GitHub repo *topics*** — gh's `skill publish` adds the `agent-skills` GitHub **topic** for discoverability (S1 Finding 4). 
GitHub calls these "topics." So "topic" is actually the term the adjacent ecosystem already uses for "a label you discover repos/skills by," which *reinforces* the rename rather than opposing it. +- The collision is real and material: skillrig pins versions with **git tags** (`tag_scheme = "name-vSEMVER"`, e.g. `terraform-plan-review-v1.4.0`; `--pin `). Documenting "filter by `--tag`" next to "pin a git `--pin `" is genuinely confusing for humans and agents. Renaming the discovery label to **topic** removes the ambiguity at no cost. + +**Decision: rename.** Flag `--topic` (repeatable), catalog/data-model field `topics[]`, manifest source `metadata.x-skillrig.topics`. This is a *pre-release* breaking rename and per the PRE-RELEASE marker requires no back-compat. It touches: the manifest key (`x-skillrig.tags` → `x-skillrig.topics`), the `Manifest.Tags` Go field → `Manifest.Topics`, the catalog `tags` JSON key → `topics` (and `CatalogEntry.Tags` → `Topics`), the PoC origin skill frontmatter + regenerated `index.json`, and every spec/data-model mention of "tag" as a label. (Leave true git-tag usages — `--pin`, `tag_scheme` — as "tag.") + +### Finding 7: Where it runs + catalog fields — consumer-side in-memory filter; name/description/topics suffice, NO new field needed + +- **Where:** confirmed **consumer-side, in-memory, deterministic filter over the fetched single-tip catalog.** `search` resolves the origin (`config.ResolveOrigin`), fetches `index.json` (per-call, D-catalog-fetch), parses it (`skillcore` — one catalog parser, AP-04), gates `skillrigConvention` (FR-016), then runs the Finding-4 matcher in memory and renders two-level output. No server, no persisted state, no cache (N1, consume-only). 
+- **Catalog/frontmatter fields:** the matcher needs only `name`, `description`, and `topics[]` — **all three already exist** in the catalog (`name`, `description`, `tags`→renamed `topics`) and in the frontmatter source (`name`/`description` standard, `tags`→`topics` under `x-skillrig`). **No new field (e.g. an explicit `keywords`) is warranted.** The agentskills.io spec already guides authors to pack *discovery keywords into `description`* ("Should include specific keywords that help agents identify relevant tasks") — so description *is* the keyword field by ecosystem convention; a separate `keywords` field would duplicate that and add drift surface. `version`/`path` stay required so the machine output lets an agent call `add` (FR-003/FR-005). **No data-model field additions; only the `tags`→`topics` rename.** + +## Decisions + +- **D-S5-algo — case-insensitive token-AND **substring** match over `name`+`description`+`topics`, ordered by a fixed-bucket relevance score (exact-name 3 / name-hit 2 / topic-hit 1 / description-only 0) then lexicographic by `name`.** Pure deterministic function of (query, catalog); stable total order via the unique-name tiebreak (SC-002, N6). `--topic T …` is a *separate* exact-string, case-insensitive, AND-across-repeats membership **filter** (FR-002/A3 preserved, just renamed). Empty query + no topic ⇒ list all by name; no match ⇒ empty result, exit 0 (FR-004). **No fuzzy/Levenshtein/embedding** in v0. +- **D-S5-index — keep the single committed flat `index.json` (S2), filtered in memory at search time; no second index artifact (committed or ephemeral).** Option (b) committed inverted index = rejected (duplicate source-of-truth, YAGNI at scale, doubles drift surface); option (c) ephemeral index = degenerates into a plain slice scan ⇒ adopt the plain scan. Git-friendly by construction (one small reviewable JSON; full-regenerate, no churn). 
Linear scan is adequate to ~10k+ entries in a long-lived process — orders of magnitude past skillrig's ≤ low-hundreds single-tip private-org scale; an index structure is YAGNI. +- **D-S5-deps — STDLIB ONLY (`strings` + `slices`/`sort`); add NO search dependency.** Every external candidate is unjustified at this scale: bleve = git-hostile binary index + heavyweight (REJECT); closestmatch = tuned for long docs, worse on short names (REJECT); lithammer/fuzzysearch = looser subsequence + manual tiebreak, no win (DEFER/REJECT); sahilm/fuzzy = the clean pure-Go/no-cgo/deterministic option but only if/when fuzzy is wanted (DEFER as the named post-v0 upgrade). The pre-approval for new deps is *not exercised* for search. +- **D-S5-topic — RENAME label→topic.** `--topic` flag (repeatable), data-model/catalog field `topics[]`, manifest `metadata.x-skillrig.topics`. The agentskills.io standard defines no tags/topics/keywords field (categorization is skillrig's own `x-skillrig.*`), so no upstream convention is broken; GitHub's own "topics" usage *reinforces* the term; and it removes the genuine collision with git-tag version pins (`--pin `, `name-vSEMVER`). Pre-release → no back-compat. Git-tag usages stay "tag." +- **D-S5-where — consumer-side in-memory filter over the per-call-fetched single-tip catalog; NO new catalog/frontmatter field.** `name`+`description`+`topics` (all already present) are sufficient; `description` is the ecosystem-sanctioned keyword field, so no separate `keywords`. Only field change anywhere is the `tags`→`topics` rename. + +## Recommendations + +1. **Implement the matcher in `pkg/skillcore` as a pure, presentation-free function** — e.g. `SearchCatalog(cat Catalog, query string, topics []string) []CatalogEntry` — so it is unit-testable on ordering/determinism (assert exact result *order*, not just membership — constitution §II) and reused by any caller (AP-04). 
`internal/cli`'s `search` command does fetch → `ParseCatalog` → convention-gate → `SearchCatalog` → two-level render. Keep all matching in `skillcore`; keep rendering in `cli`. +2. **Spell the algorithm out in `search --help`** (one runnable example with a query, one with `--topic`) so an agent succeeds first-try (SC-008): "matches QUERY as space-separated terms (all must appear) against a skill's name, description, and topics; `--topic` filters to skills carrying that topic; results are listed best-match-then-alphabetical." +3. **Do the `tags`→`topics` rename in the same commit as the S1 manifest migration** (it touches the same `Manifest` struct + origin frontmatter + `index.json`), so the format is written once. Update: `metadata.x-skillrig.tags`→`topics`; `Manifest.Tags`→`Topics`; `Catalog`/`CatalogEntry` `tags` JSON→`topics`; the PoC origin skill + regenerated `index.json`; data-model.md §1/§2 and spec/spec-tech "tag"→"topic" (except git-tag/`--pin`). +4. **Record the YAGNI boundaries as explicit non-goals** (so a reviewer doesn't reopen): no inverted index, no fuzzy/Levenshtein, no embeddings/vector/semantic, no `keywords` field, no search cache/server in v0. The named upgrade path *if forgiving matching is ever needed* is `github.com/sahilm/fuzzy` (pure-Go, no-cgo, deterministic) — not bleve. +5. **Add determinism tests** asserting byte-identical output ordering across runs and across shuffled input-catalog order (SC-002), plus the empty-query=list-all and no-match=exit-0 cases (FR-004), and `--topic` AND-narrowing (FR-002). + +## Scope flags (things that would EXPAND scope — guard against) + +- **Free-text ranking creep.** Keep the score a *fixed 4-bucket lookup* + name tiebreak. Do **not** introduce TF-IDF/BM25/corpus-statistics ranking — it makes ordering depend on the whole catalog (still deterministic, but fragile as the catalog grows, harder to test, and unnecessary at this scale). Out of scope. 
+- **Fuzzy/typo tolerance.** Deferred to post-v0 (`sahilm/fuzzy`); not in this slice. Pulling it in now is the UX hazard of Finding 4. +- **A `keywords` frontmatter field.** Rejected (Finding 7) — `description` is the keyword field by ecosystem convention. Adding it expands the manifest contract for no gain. +- **Search caching / a persisted index.** Out of scope (N1 consume-only, D-catalog-fetch per-call). Any cache is a later optimization, not v0. +- The **`tags`→`topics` rename is a data-model + spec change** (the only required artifact change from this spike) — see "Required artifact changes." + +## Required artifact changes (from this spike) + +- **data-model.md §1**: `Manifest.Tags []string` (from `metadata.x-skillrig.tags`) → **`Topics []string`** (from `metadata.x-skillrig.topics`); update the example frontmatter (`x-skillrig.tags` → `x-skillrig.topics`). +- **data-model.md §2**: `Catalog`/`CatalogEntry` `Tags`/`tags` → **`Topics`/`topics`**; the comment "search --tag filters on these" → "search --topic filters on these"; add `topics` to the documented `search`-consumed schema. +- **data-model.md (new)**: document `SearchCatalog(cat, query, topics) []CatalogEntry` (or equivalent) as the single matcher in `skillcore`, with the matching + ordering rules above; note the entity "Topic (tag)" → **"Topic"**. +- **spec.md**: FR-002 keep "deterministic, no inference/ranking" for the **`--topic` filter**; *add* an FR (or extend FR-001/FR-005) for the **`[QUERY]` free-text match over name+description+topics** with the deterministic score-then-name ordering; rename "Topic (tag)" entity and `--tag`→`--topic`; Acceptance Scenarios US1.2/US1.3 reword "topic" (already mostly "topic"); keep US1.1 "list all" for empty query. +- **spec-tech.md §1**: command surface `skillrig search [QUERY] [--tag T ...]` → **`[--topic T ...]`**; note the matcher lives in `skillcore` (AP-04) and is stdlib-only. 
+- **Origin template (FR-023)**: in the same S1 frontmatter migration, rename `x-skillrig.tags`→`topics` and regenerate `index.json` with `topics`. +- **No new dependency in go.mod** for search (stdlib only). The only S1-introduced dep (`gopkg.in/yaml.v3`) is unrelated to search. + +## References + +- N6 / determinism + offline, no LLM/vector ranking — binding spike constraints (architecture §9/§10, requirements N6). +- Single-tip catalog (one entry per HEAD skill; version history = git tags via `--pin`) — `specledger/003-search-remote/research/2026-05-31-catalog-generation-lifecycle.md` (D-S2-tip). +- Catalog field-source = frontmatter `metadata.x-skillrig.*`; `search` consumes `name/version/description/tags(→topics)/path` + root `skillrigConvention/origin` — `specledger/003-search-remote/research/2026-05-31-skill-manifest-format.md` (D-S1-catalog-source); data-model.md §2. +- agentskills.io spec frontmatter field table (NO tags/topics/keywords field; `description` "should include specific keywords"; `metadata` = arbitrary client extension) — https://agentskills.io/specification. +- gh has no catalog — code-search + GitHub repo **topic** (`agent-skills`) for discovery — S1 Finding 4 (`/Users/vincentdesmet/specledger/skillrig/gh-cli` `pkg/cmd/skills/search`, `publish` topic-add). +- Real origin catalog (1 skill today; single-tip shape) — `/Users/vincentdesmet/specledger/skillrig/skillrig-origin/index.json`. +- git-tag version scheme (the collision) — `.skillrig-origin.toml` `tag_scheme = "name-vSEMVER"`; `--pin ` (spec-tech.md §5). +- Library evaluation: + - `github.com/blevesearch/bleve` — persists a binary on-disk index, heavyweight transitive deps (git-hostile; REJECT). + - `github.com/sahilm/fuzzy` v0.1.2 — pure Go, zero deps, no cgo, deterministic descending-quality order (the named post-v0 fuzzy upgrade; DEFER). + - `github.com/lithammer/fuzzysearch` — pure Go, subsequence match + Levenshtein rank, needs manual stable tiebreak (no v0 win). 
+ - `github.com/schollz/closestmatch` — pure Go n-gram bag-of-words, tuned for long docs, *worse on short single-word names* per its own docs (wrong tool). +- Go stdlib matcher primitives — `strings.ToLower/Contains/Fields`, `slices.SortFunc`/`sort` (the recommended implementation). +- `pkg/skillcore` single-parser/single-impl discipline (AP-04) and presentation-free rule — CLAUDE.md; data-model.md. diff --git a/specledger/003-search-remote/research/2026-05-31-skill-manifest-format.md b/specledger/003-search-remote/research/2026-05-31-skill-manifest-format.md new file mode 100644 index 0000000..578868d --- /dev/null +++ b/specledger/003-search-remote/research/2026-05-31-skill-manifest-format.md @@ -0,0 +1,145 @@ +# Research: Skill Manifest Format — `skill.toml` vs agentskills.io frontmatter + `x-skillrig.*` + +**Date**: 2026-05-31 +**Context**: Spike S1 for `003-search-remote` (spec-tech.md §8b). The catalog `search` reads is *generated from* skill metadata, so the manifest format is the field-source decision that gates the rest of 003. 002 shipped a sibling `skill.toml`; the working hypothesis is to drop it and move metadata into `SKILL.md` frontmatter with skillrig extensions. +**Time-box**: ~30 min +**Confidence**: HIGH (every load-bearing claim verified against checked-out gh-cli source and the live agentskills.io spec). + +## Question + +Should 003 keep the current `skill.toml` sibling manifest (002) or move skill metadata into agentskills.io **frontmatter** in `SKILL.md` with skillrig extensions under the standard's `metadata` section? And if migrating: can `allowed-tools` express skillrig's `[[requires]]`, what is the concrete `x-skillrig.*` shape, how big is the migration, and must a "manifest reframe" land before 003? + +## Findings + +### Finding 1: The agentskills.io standard — exact field set (authoritative) + +Fetched https://agentskills.io/specification. 
The complete frontmatter field table: + +| Field | Required | Constraint (verbatim) | +|---|---|---| +| `name` | Yes | ≤64 chars, lowercase alnum + hyphens, no leading/trailing/`--`. **Must match parent dir name.** | +| `description` | Yes | ≤1024 chars, non-empty. | +| `license` | No | License name or reference to bundled license file. | +| `compatibility` | No | ≤500 chars. Free text: "intended product, required system packages, network access, etc." | +| `metadata` | No | **"A map from string keys to string values. Clients can use this to store additional properties not defined by the Agent Skills spec... We recommend making your key names reasonably unique to avoid accidental conflicts."** | +| `allowed-tools` | No | **"A space-separated string of tools that are pre-approved to run" (Experimental).** Example: `allowed-tools: Bash(git:*) Bash(jq:*) Read`. | + +There is **no `version` field and no `tags` field** in the standard. The spec's own example puts `version: "1.0"` *inside* `metadata` (`metadata: {author: ..., version: "1.0"}`) — confirming version/tags are extension territory, not standard fields. + +**Key consequence:** `metadata` is the officially-sanctioned, namespaced extension mechanism. This is exactly what the hypothesis relies on, and the spec explicitly blesses it. + +### Finding 2: `allowed-tools` CANNOT express skillrig's `[[requires]]` — decisive + +The hypothesis floated `allowed-tools` as "the natural home for backing-CLI `requires`." **This is wrong**, on two independent confirmations: + +1. **Spec semantics**: `allowed-tools` is "tools pre-approved to *run*" inside an agent session — `Bash(git:*) Read`. It is an agent-permission allowlist (which tool *invocations* the agent may make), not a backing-CLI prerequisite list. It is a flat space-separated **string**, with no slot for a version constraint (`>=0.4.0`) or a `source` repo. + +2. 
**gh-cli enforces this**: `pkg/cmd/skills/publish/publish.go:262-270` actively **rejects** `allowed-tools` as an array — "allowed-tools must be a string (space-delimited), not an array." Tests at `publish_test.go:390,521` use `allowed-tools: git` (bare name). So even the reference client accepts only a flat string of tool names, with no structured fields. + +skillrig's `Require` (`pkg/skillcore/manifest.go:24-29`) carries `tool` + `version` (constraint) + `source` (private repo) + `manager`. None of that survives in `allowed-tools`. **`requires` MUST live under `x-skillrig.*`.** `compatibility` (free-text ≤500 chars) is also unsuitable as a structured source — it's human prose, not machine-parseable version constraints. + +### Finding 3: How the Go gh CLI parses frontmatter (validation #1) + +Parser: `/Users/vincentdesmet/specledger/skillrig/gh-cli/internal/skills/frontmatter/frontmatter.go`. + +- **YAML lib**: `gopkg.in/yaml.v3` (line 9). Standard, robust. +- **Robustness**: `Parse` (lines 31-63) trims leading `\r\n`, requires a leading `---`, finds the closing `\n---`, and `yaml.Unmarshal`s the block **twice** — once into a raw `map[string]interface{}` (lines 48-51) preserved as `RawYAML`, and once into a typed `Metadata` struct (lines 53-56). No frontmatter → returns the whole content as `Body`, no error (graceful). Invalid YAML → hard error `"invalid frontmatter YAML: %w"`. +- **The `Metadata` struct** (lines 14-20): `Name`, `Description`, `License`, and crucially `Meta map[string]interface{} \`yaml:"metadata,omitempty"\``. So gh treats `metadata` as an **arbitrary nested map** — `interface{}` values, NOT restricted to strings (more permissive than the spec's "string→string" letter). +- **Extension pattern (the precedent that matters)**: `InjectGitHubMetadata` (lines 70-98) writes provenance into `metadata` under **flat, hyphen-prefixed keys**: `github-repo`, `github-ref`, `github-tree-sha`, `github-path`, `github-pinned`.
Test `frontmatter_test.go:172-184` confirms the on-disk shape is flat (`metadata.github-tree-sha: tree456`), NOT a nested `metadata.github: {...}` sub-map. The comment (lines 65-68) states the intent verbatim: *"Keys are prefixed with `github-` to avoid collisions with other tools' metadata."* + +This is direct, first-party validation of the hypothesis's approach: a namespaced-key-prefix under `metadata` is exactly how the in-stack reference client already extends the standard. + +### Finding 4: Prior art on search-by-topic (validation #5) — gh has NO generated catalog + +This reframes 003's catalog design. + +- **`gh skill search`** (`pkg/cmd/skills/search/search.go`): there is **no `index.json`**. Search hits the **GitHub Code Search API** for `filename:SKILL.md ` (line 288), with variants for `path:` and `user:` scoping (lines 290-334). `--json` fields are `repo, skillName, namespace, description, stars, path` (lines 39-47). **No `--tag` flag, no tag filtering** — relevance ranking is name-match-first over code search results. +- **Discoverability = repo topic**, not a catalog. `gh skill publish` adds the **`agent-skills` GitHub topic** to the repo (`publish.go:483-493,692-699`); that topic is how skills become findable. Versioning = git tags + GitHub Releases (`publish.go:459-469`, immutable releases). +- **Vercel `npx skills` / `sl`**: architecture §4.2 lineage note + §11b confirm Vercel uses a lockfile + tree-SHA model (cited for the two-lock split), but no evidence of a tag-faceted search catalog; gh is the closer prior art and it does *topic + code-search*, not a faceted index. 
+ +**Implication for 003's `search`**: skillrig's design (a generated `index.json` carrying `tags[]`, gated by `skillrigConvention`) is a **deliberate divergence** from gh's "code-search + repo-topic" model — it gives skillrig the org-controlled, `--tag`-filterable, offline-coherent catalog that gh lacks (architecture §366 already flags gh's "discovery leans on GitHub topic conventions, not a fully org-controlled artifact" as a contract gap). So 003 keeps the generated catalog, but S1's manifest format becomes its field-source. + +### Finding 5: Current state — frontmatter and `skill.toml` already coexist (partial migration already real) + +The PoC origin skill `/Users/vincentdesmet/specledger/skillrig/skillrig-origin/skills/terraform-plan-review/` **already has agentskills.io frontmatter** in `SKILL.md` (`name` + `description`) AND a full `skill.toml` (name, version, namespace, description, tags, two `[[requires]]`). So `name`/`description` are **duplicated** today — a latent drift bug. The `index.json` is generated from the richer `skill.toml` shape. + +### Finding 6: Migration scope (validation #4) — small and self-contained + +Every `skill.toml` touch-point in the CLI: + +- **`pkg/skillcore/manifest.go`** — `Manifest`/`Require` structs + `ParseManifest(path)` (reads a TOML file via `pelletier/go-toml/v2`). ~47 lines. This is the *only* parse implementation (AP-04 single-impl already holds). +- **`pkg/skillcore/add.go:91`** — the sole caller: `ParseManifest(filepath.Join(srcDir, "skill.toml"))`. +- **`pkg/skillcore/verify.go:193-195`** — `isSkillDir` treats a dir as a skill if it contains `skill.toml` **OR** `SKILL.md`. (Already SKILL.md-aware; can simply drop the `skill.toml` marker.) +- **`pkg/skillcore/treesha.go:2`** — doc comment only. +- **Tests** — `manifest_test.go`, `add_test.go`, `verify_test.go`, `treesha_test.go`, `helpers_test.go` (`sampleManifest` fixture), `test/skillcore_quickstart_test.go`. 
All write a `skill.toml` fixture; these flip to writing frontmatter. +- **Origin template** — `skill.toml` → fold into `SKILL.md` frontmatter; regenerate `index.json` from frontmatter; update `scripts/build-index.sh` (this is FR-023, already in scope and entangled with S2). +- **Docs** — architecture §4.1 (`skill.toml` schema block), §307 (`internal/index` walks `skills/*/skill.toml`), CLAUDE.md, ROADMAP, cli.md references. + +**The migration is a focused refactor of one parser + its callers + fixtures.** It adds exactly **one dependency**, `gopkg.in/yaml.v3`, the same library gh uses (the project already commits to "coupling to the standard, already incurred via R20-R21"; YAML is the standard's serialization). This is a deliberate, justified exception to "no new deps," because frontmatter IS YAML and there is no TOML escape. `pelletier/go-toml/v2` stays (config + lock-adjacent). + +### Finding 7: Concrete `x-skillrig.*` shape + +The spec says `metadata` is string→string and recommends unique keys; gh's precedent is **flat, hyphen-prefixed keys** (`github-tree-sha`). But skillrig needs *structured* data (`tags[]`, `requires[]` of objects). Two viable encodings: + +**Option A — `x-skillrig.`-prefixed keys under `metadata`, with a nested value only where structure is needed (recommended):** +```yaml +--- +name: terraform-plan-review +description: Review a terraform plan for risk and drift before apply... +license: Proprietary +metadata: + x-skillrig.version: "1.4.0" + x-skillrig.namespace: my-org + x-skillrig.convention-version: "1" + x-skillrig.tags: platform-team terraform aws # space-delimited, mirrors allowed-tools convention + x-skillrig.requires: # nested list-of-maps (yaml.v3 / interface{} parse it fine) + - tool: oxid + version: ">=0.4.0" + source: my-org/my-skills + manager: mise + - tool: terraform + version: ">=1.6" + source: hashicorp/terraform + manager: mise +--- +``` +gh-cli's `map[string]interface{}` parses this losslessly.
It *bends* the spec's "string→string" letter for `x-skillrig.requires` (nested), but: (a) the spec's own field is `map[string]interface{}` in the reference impl, (b) it's namespaced so it can't collide, (c) tags-as-space-string keeps the cheap fields spec-pure. Use dotted prefix `x-skillrig.` per the spec's "unique key" guidance and gh's `github-` precedent. + +**Option B — fully flat (most spec-literal), tags/requires as encoded strings:** avoid for `requires` (encoding a list-of-objects into a string is ugly and lossy). Only adopt if a downstream agentskills.io client is found that hard-rejects non-string metadata values — none found in gh. + +**Recommendation: Option A.** Map skillrig fields as: `name`/`description`/`license` → standard top-level; `tags`/`version`/`namespace`/`convention-version`/`requires` → `metadata.x-skillrig.*`. Do NOT put `requires` in `allowed-tools` (Finding 2). + +## Decisions + +- **D-S1-format — MIGRATE to frontmatter + `x-skillrig.*`; drop `skill.toml`.** Confirmed by every validation axis: the standard's `metadata` field exists *specifically* for this (Finding 1), gh-cli proves the namespaced-key pattern in production (Finding 3), and the sibling-file rationales ("two audiences / travels-with-skill / offline-doctor / TOML-nicer") don't survive — frontmatter travels with `SKILL.md` atomically (no name/description drift, which the PoC origin *currently has*), is offline-readable by the same parse, and aligns with 26+ ecosystem clients. Cost is YAML-vs-TOML cosmetics + one new dep (`yaml.v3`) — both already implied by standard-coupling. +- **D-S1-requires — `requires` lives under `metadata.x-skillrig.requires`, NOT `allowed-tools`.** `allowed-tools` is an agent-permission string (bare tool names), provably unable to carry version+source; gh-cli rejects arrays there. This corrects the hypothesis. 
+- **D-S1-catalog-source — the catalog (`index.json`) is generated FROM frontmatter `metadata.x-skillrig.*` + standard `name/description`.** `search`'s consumed schema (`name, version, description, tags[], path`, catalog-level `skillrigConvention, origin`) sources `version`/`tags` from `x-skillrig.*`, `name`/`description` from standard frontmatter, `path` from directory location. This is the field-source answer S2 depends on. +- **D-S1-sequencing — the manifest reframe is SMALL and lands IN-SLICE, as 003's first commit.** It is one parser (`manifest.go`) + one caller (`add.go:91`) + `isSkillDir` (already SKILL.md-aware) + fixtures + origin-template/`index.json`. Not big enough to warrant a separate prerequisite feature. Doing it first (before remote `add`/`search`) means the fetched-subtree read and catalog-parse are both written against the new format from the start, avoiding a double rewrite. + +## Recommendations + +1. **Land the manifest reframe as commit 1 of 003.** Replace `ParseManifest` to read `SKILL.md` frontmatter (gopkg.in/yaml.v3): parse standard `name/description/license`, then map `metadata` → pull `x-skillrig.version/namespace/tags/requires/convention-version`. Keep the `Manifest`/`Require` Go types nearly as-is (their fields are format-agnostic). Add `gopkg.in/yaml.v3` to go.mod with a one-line rationale comment (frontmatter is YAML; no TOML path exists for the standard). +2. **Adopt `x-skillrig.*` Option A** (nested map under `metadata`, dotted-prefix keys; `tags` as space-string, `requires` as nested list-of-maps). Document this as the convention-1 manifest contract alongside the catalog contract. +3. **Fix the origin template in the same branch (FR-023):** fold `skill.toml` into `SKILL.md` frontmatter, delete the duplicate `name`/`description` drift, and regenerate `index.json` + `build-index.sh` from frontmatter. Coordinate with S2 (catalog generation) — S2 now has its field-source nailed down. +4. 
**Keep skillrig's generated `index.json` catalog** (do NOT adopt gh's code-search + repo-topic discovery). It is the org-controlled, `--tag`-filterable, offline-coherent artifact gh lacks; just re-point its generator at frontmatter. +5. **Update `isSkillDir`** to key on `SKILL.md` only (drop the `skill.toml` marker) once fixtures migrate, so verify's orphan-detection stays correct. +6. **Update docs in-branch** (CLAUDE.md per the same-branch rule): architecture §4.1 (replace TOML block with frontmatter), §307 (index generator walks `SKILL.md`), cli.md/ROADMAP. + +## Risks + +- **Spec-letter vs. practice on nested metadata.** The spec says `metadata` is string→string; `x-skillrig.requires` (nested list) bends that. Mitigation: gh's reference impl uses `map[string]interface{}` (parses nested fine); the data is namespaced so it can't collide with other clients; only skillrig reads it. LOW risk, but note it in the contract. If a stricter validator (`skills-ref validate`) rejects non-string metadata, fall back to a JSON-encoded string for `x-skillrig.requires` only — verify against `skills-ref` before finalizing (not done in this time-box; flag for plan). +- **New dependency (`gopkg.in/yaml.v3`).** Violates the "no new deps" stance literally, but is unavoidable for a YAML standard and matches gh. Accept with rationale. +- **Coupling to a moving standard.** agentskills.io `allowed-tools` is marked "Experimental"; the standard "may change." skillrig only consumes stable fields (`name`/`description`/`metadata`) and owns its `x-skillrig.*` namespace, so exposure is limited. Already-incurred per project history. +- **`name` must match parent dir name (spec rule).** skillrig's add/verify should keep honoring this; the directory-derived skill identity already aligns with it. Minor: enforce in lint, not blocking for 003. 
+ +## References + +- agentskills.io specification — https://agentskills.io/specification (frontmatter field table; `metadata`, `allowed-tools`, `compatibility` definitions quoted in Finding 1). +- gh-cli frontmatter parser — `/Users/vincentdesmet/specledger/skillrig/gh-cli/internal/skills/frontmatter/frontmatter.go` (yaml.v3; `Metadata.Meta map[string]interface{}`; `InjectGitHubMetadata` flat `github-`-prefixed keys, lines 14-20, 65-98). +- gh-cli frontmatter test — `.../internal/skills/frontmatter/frontmatter_test.go:172-184` (flat `metadata.github-tree-sha` round-trip). +- gh-cli `allowed-tools` validation — `.../pkg/cmd/skills/publish/publish.go:262-270` ("must be a string, not an array"); tests `publish_test.go:390,521`. +- gh-cli search (code-search + topic, no catalog) — `.../pkg/cmd/skills/search/search.go:288-334,39-47`; publish topic add `publish.go:483-493,692-699`. +- skillrig 002 manifest — `pkg/skillcore/manifest.go` (`ParseManifest`, `Manifest`, `Require`); sole caller `pkg/skillcore/add.go:91`; `isSkillDir` `pkg/skillcore/verify.go:193-195`. +- PoC origin manifest — `/Users/vincentdesmet/specledger/skillrig/skillrig-origin/skills/terraform-plan-review/skill.toml` + `SKILL.md` (frontmatter already present; name/description duplicated) + `index.json`. +- Architecture — `docs/ARCHITECTURE-v0.md` §4.1 (skill.toml schema, lines 161-184), §4.2 (treeSha/commit), §307 (index generator), §366 (gh discovery-via-topic gap), §370 (OpenClaw requires prior art). +- spec-tech.md §8b — `specledger/003-search-remote/spec-tech.md` (Spike S1 framing, FR-023). 
diff --git a/specledger/003-search-remote/reviews/003-review.md b/specledger/003-search-remote/reviews/003-review.md new file mode 100644 index 0000000..11be2d5 --- /dev/null +++ b/specledger/003-search-remote/reviews/003-review.md @@ -0,0 +1,91 @@ +--- +date: 2026-05-31 +total_requirements: 37 +total_tasks: 0 +coverage_pct: "86% (32/37 fully covered; 5 gaps flagged C1/C5/C6/C8/C9 — all remediated)" +critical_issues: 0 +--- + +# Specification Analysis Report — 003-search-remote (no-tasks cross-verify) + +**Scope:** spec.md ↔ spec-tech/plan/research/data-model/contracts/quickstart. Tasks dimension intentionally skipped. 2 independent reviewers (Opus, loading agentic-go-cli-design + golang-spf13-cobra + golang-testing) merged. Produced by `/specledger.verify-workflow` (disk brief + template). + +> All findings below were **remediated 2026-05-31** (C1 per the exact-match `== 1` decision). See the Remediation section at the end. + +| ID | Source | Category | Severity | Location(s) | Summary | Recommendation | +|----|--------|----------|----------|-------------|---------|----------------| +| C1 | r1,r2 | Consistency/Ambiguity (convention policy) | HIGH | spec-tech.md:51; data-model.md:97,141; contracts/search.md:21; quickstart.md:20 | Convention-version policy undecided AND contradictory: data-model:97 "supports 1" (exact-match) vs :141 trigger `> supported` (forward-only — convention 0 or a missing field silently PASSES). Only `convention:2` tested, so FR-016/SC-005 unverifiable. | Pick exact-match `== 1` (YAGNI); make :97/:141 agree (`> supported`→`!= 1`); align data-model §2/§5 + search.md step 3; add a non-`>` boundary quickstart case. 
| +| C2 | r2 | Consistency/Ambiguity (FR-015) | MEDIUM | contracts/add-remote.md:39; research.md:53 (D5); data-model.md:132-141 | "no such version" vs "skill not found" modeled as a `NotFoundError` *variant* with no structured discriminator — message-only differentiation, which errors-as-navigation cautions against (agents/CI can't branch on prose). Test asserts only a substring. | Add a distinct typed error (`NoSuchVersionError`) or a `kind` discriminator; assert it in the test, not just a substring. | +| C3 | r1 | Decision integrity/Ambiguity (`--pin`) | MEDIUM | spec-tech.md:57; contracts/add-remote.md:14; data-model.md:116 | `--pin` accepts bare semver / full `name-vSEMVER` tag / raw SHA with no disambiguation rule; FR-013/014/015 + SC-004 depend on a single deterministic resolution. | Specify resolution order (`^v?SEMVER$`→expand via tag_scheme; else literal ref/SHA) + precedence; quickstart bare-semver vs full-tag equivalence. | +| C4 | r1,r2 | Decision integrity (D8 tag→topic) | MEDIUM | spec-tech.md:33,102; contracts/index.md:42; research.md:13 | Residual `tag`/`tags`/`--tag` from the D8 rename in binding artifacts. | Replace with `topic`/`topics` (historical S1 quote + data-model "renamed from tags" may stay). | +| C5 | r1 | Coverage (add --help) | MEDIUM | contracts/add-remote.md:42; quickstart.md US2; spec.md SC-008 | `TestQuickstart_AddHelpExamples` referenced in the contract but absent from quickstart.md; SC-008 mapped only to US1. | Add the scenario to US2 (purpose + ≥2 examples incl. `--pin`); map SC-008 to US2. | +| C6 | r1 | Coverage (add --dry-run) | MEDIUM | spec.md:169 (FR-020); contracts/add-remote.md:15,27; quickstart.md US2 | No scenario exercises `add --dry-run` for the remote path; FR-020 dry-run clause unverified. | Add `TestQuickstart_AddDryRun` (bounded preview, exit 0, no FS/lock mutation). 
| +| C7 | r1 | Consistency (producer hardcodes convention) | LOW | contracts/index.md:21,26; spec-tech.md:30; data-model.md:97 | `index` "carry `skillrigConvention: 1`" as a literal vs `.skillrig-origin.toml` `convention_version` being the source of truth — a future divergence self-rejects. | State `index` reads `skillrigConvention` from `.skillrig-origin.toml`, not a hardcoded 1. | +| C8 | r1 | Coverage (index not-in-origin) | LOW | contracts/index.md:32,39; quickstart.md US5 | `index` exit-1 "not in an origin repo / unreadable skills_dir" path has no scenario (only malformed-frontmatter). | Add `TestQuickstart_IndexNotInOrigin`. | +| C9 | r1 | Consistency (seed vs validation) | LOW | data-model.md:54; plan.md:90 | "version required for catalog entries" hard rule vs plan step-6 seeded skills needing manual `x-skillrig.version` enrichment; no case covers a skill missing it. | Add a case asserting `index` fails clearly on missing `x-skillrig.version`; make enrichment a checked precondition of the oracle. | +| C10 | r2 | Consistency (matcher name) | LOW | plan.md:85; data-model.md:147 | Matcher named `SearchCatalog` (plan) vs `Search` (data-model §5b). | Unify on `Search` (the authored signature); fix plan.md:85. | +| C11 | r2 | Consistency (under-cited decisions) | LOW | data-model.md:3 | Header says "D1–D7" but the doc rests on D8 (§5b, topics). | Change to "D1–D8". | +| C12 | r2 | Coverage/Traceability (under-cited FRs) | LOW | quickstart.md:58-66 | US1 row omits FR-003/FR-004 (covered) and cites FR-020 only under US4. Citation gap, not a real coverage gap. | Add FR-003/FR-004 to US1; reference FR-020 from US1. | +| C13 | r2 | Consistency (help header miscite) | INFO | contracts/search.md:41 | "Help (FR-018/SC-008)" — FR-018 is the unreachable requirement, unrelated to help. | Change to "Help (SC-008/FR-020)". 
| +| C14 | r1,r2 | Constitution (§III divergence adequacy) | INFO | plan.md:30; constitution.md:78,82-83 | The two recorded §III divergences are ADEQUATELY handled (httptest-not-applicable; skill.toml→SKILL.md), correctly routed to the FR-024 team-approval sweep; no MUST violation. Also: §IX `scripts/run_eval.py` path is stale. | No plan change; ensure the FR-024 sweep enumerates §III:78, §III:82-83, and the stale §IX `run_eval.py` path. | + +### Coverage summary + +| Requirement | Plan | Contract | Quickstart test | Status | +|-------------|------|----------|-----------------|--------| +| FR-001 | step 5 | search.md | SearchListsSkills | Covered | +| FR-002 | step 5 | search.md | SearchQueryMatchesNameDesc / SearchOrderingDeterministic | Covered | +| FR-002a | step 5 | search.md | SearchFilterByTopic | Covered | +| FR-003 | step 5 | search.md | SearchJSONComplete | Covered (C12 citation) | +| FR-004 | step 5 | search.md | SearchEmptyResult | Covered (C12 citation) | +| FR-005 | step 5 | search.md | SearchListsSkills | Covered | +| FR-006..012 | step 6 | add-remote.md | AddRemoteNoLocalCopy/Idempotent/ForceOnDivergence/LocalPathStillWorks/ClassifyNotFound | Covered | +| FR-013/014 | step 6 | add-remote.md | AddPinnedReproducible | Covered (C3 pin grammar) | +| FR-015 | step 6 | add-remote.md | AddPinNotFound | Covered, was message-only (C2) | +| FR-016 | step 5/6 | search.md/add-remote.md | SearchConventionMismatch | Gap → fixed (C1) | +| FR-017/018/019 | step 6 | add-remote.md/search.md | ClassifyAuth/Unreachable / VerboseShowsRawCause | Covered | +| FR-020 | step 5/6 | search.md/add-remote.md | SearchJSONComplete; add --dry-run | Gap → fixed (C6) | +| FR-021/022 | step 5/6 | search.md/add-remote.md | SearchEmptyResult / AddRemote* | Covered | +| FR-023/024 | step 7 | (process) | — | Covered as process deliverables (C4 wording) | +| FR-025..028 | step 4 | index.md | IndexGenerates/Deterministic/MatchesCommitted | Covered | +| SC-001..004,006,007 | step 5/6 | 
search/add-remote | AddRemote*/Pinned/Idempotent/LocalPath | Covered | +| SC-005 | step 5/6 | search/add-remote | Auth/Unreachable/ConventionMismatch | Gap → fixed (C1) | +| SC-008 | step 5/6 | search/add-remote | SearchHelpExamples; add help | Gap → fixed (C5) | +| SC-009 | step 4 | index.md | IndexGenerates/Deterministic/MatchesCommitted | Covered (C9 precondition) | + +### Decision integrity + +D1 frontmatter ✓ (stale `tags` at research.md:13 — C4) · D2 catalog ✓ · D3 local-vs-remote ✓ · D4 auth ✓ · D5 lock/pin ✓ (C2/C3 caveats) · D6 test substrate ✓ (§III divergence adequate — C14) · D7 transport ✓ · D8 query/topic ✓ (residual `tag` — C4). Session 2026-05-31 bullets (S1–S5, A1, yaml.v3) all applied. + +### Metrics + +- Requirements: 28 FR + 9 SC · Reviewers: 2 · Critical: 0 · High: 1 · Medium: 5 · Low: 6 · Info: 2 · Coverage gaps: 5. + +### Next actions + +- C1 (HIGH) resolved via exact-match `== 1`; C2–C13 remediated; C14 folded into the FR-024 sweep list. See Remediation below. + +--- + +# Remediation — 2026-05-31 + +All 14 findings resolved (C1 per the user's exact-match `== 1` decision; "fix all now"). Artifacts re-validated (`go build ./...` clean; no operative `--tag`/`D1–D7`/`> supported`/`SearchCatalog` stragglers — only historical spike narrative retains old terms, which review C4/C10 permit). + +| ID | Sev | Status | Fix | +|----|-----|--------|-----| +| C1 | HIGH | ✅ Fixed | Exact-match policy: data-model §2 gate reworded + §5 trigger `> supported`→`!= 1` (incl. lower/absent); contracts/search.md step 3; spec-tech.md:51 (Q14 reframed as future change). New `TestQuickstart_SearchConventionBoundary` (0/absent → fail, 1 → pass). | +| C2 | MED | ✅ Fixed | New typed **`NoSuchVersionError`** in data-model §5 (now "Four typed errors"; raised from failed pin ref-resolution, not stderr); add-remote.md errors reworded; `TestQuickstart_AddPinNotFound` asserts the typed discriminator, not a substring. 
| +| C3 | MED | ✅ Fixed | Deterministic `--pin` rule (bare `^v?SEMVER$`→tag_scheme expand; else literal ref/SHA) in add-remote.md flag + data-model §3; new `TestQuickstart_AddPinTagFormEquivalent` (bare-semver == full-tag → same commit/treeSha). | +| C4 | MED | ✅ Fixed | `tag`→`topic` at spec-tech.md:33,102; contracts/index.md:42; research.md:13 (D1). Historical S1 quote left. | +| C5 | MED | ✅ Fixed | `TestQuickstart_AddHelpExamples` added to quickstart US2 (purpose + ≥2 examples incl. `--pin`); SC-008 mapped to US2 in traceability. | +| C6 | MED | ✅ Fixed | `TestQuickstart_AddDryRun` added (bounded preview, exit 0, no FS/lock mutation). | +| C7 | LOW | ✅ Fixed | contracts/index.md: `skillrigConvention` read from `.skillrig-origin.toml` `convention_version`, not a hardcoded 1 (shared source with the gate). | +| C8 | LOW | ✅ Fixed | `TestQuickstart_IndexNotInOrigin` added (not-in-origin/unreadable skills_dir → exit 1 navigation message). | +| C9 | LOW | ✅ Fixed | `TestQuickstart_IndexMissingVersion` added; plan step 6 marks `x-skillrig.version` enrichment a **checked precondition** of the `IndexMatchesCommitted` oracle. | +| C10 | LOW | ✅ Fixed | Matcher name unified on `Search(...)` (data-model §5b); plan.md:85 updated from `SearchCatalog`. | +| C11 | LOW | ✅ Fixed | data-model.md:3 "D1–D7"→"D1–D8". | +| C12 | LOW | ✅ Fixed | US1 traceability row gains FR-003/004/020; SC-005 added; US2 gains FR-020 + SC-008. | +| C13 | INFO | ✅ Fixed | contracts/search.md help header "FR-018/SC-008"→"SC-008/FR-020". | +| C14 | INFO | ✅ Fixed | plan step 7 enumerates the team-approval constitution touch-ups: §III:78 (skill.toml→SKILL.md), §III:82-83 (httptest→exec-stub seam), §IX stale `run_eval.py` path. | + +**Gate after remediation:** `go build ./...` clean; 5 prior coverage gaps (FR-016/SC-005, FR-020 dry-run, SC-008 add-help, index not-in-origin, index missing-version) now each carry a `TestQuickstart_*` scenario. 
Artifacts internally consistent — **clear to proceed to `/specledger.implement-workflow`**. diff --git a/specledger/003-search-remote/sessions/003-search-remote-checkpoint.md b/specledger/003-search-remote/sessions/003-search-remote-checkpoint.md new file mode 100644 index 0000000..1ad8233 --- /dev/null +++ b/specledger/003-search-remote/sessions/003-search-remote-checkpoint.md @@ -0,0 +1,121 @@ +# Session Log: 003-search-remote + +## Divergence Review: 2026-05-31 + +**Scope:** post-implementation (HEAD `429f0ca`, clean tree). Adversarial divergence review of the implemented branch vs spec/plan/data-model/contracts/quickstart. Implemented via `/specledger.implement-workflow` (11 agents; the experiment deliberately skips the `sl issue` ledger — the gate is quickstart tests + `make check`). + +**Verdict (CORRECTED — see Adversarial Deep-Dive below):** the divergence-review pass below reported "0 CRITICAL, 0 HIGH", but that was **wrong**: it confirmed the remediated-finding test *functions exist* without checking they actually **run** — 10 remote `TestQuickstart_*` scenarios are `t.Skip`-ped, so `make check` was green over a 0%-covered remote surface. The independent adversarial deep-dive (run immediately after) flipped the verdict to **3 CRITICAL + 1 HIGH**: the P1 remote-acquisition keystone is not shippable. Lesson recorded: **a test that exists ≠ a test that runs; grep `t.Skip` at checkpoint.** + +### Divergences + +| # | Severity | Type | Category | Artifact | Description | +|---|----------|------|----------|----------|-------------| +| 1 | MEDIUM | conscious | Deliverable partial | spec FR-023 | The **separate `skillrig-origin` repo** (the real PoC origin) still carries `skill.toml` (not migrated to `SKILL.md` frontmatter) and its `index.json` is not regenerated by `skillrig index`. In-repo test fixtures ARE migrated and tests pass; but **SC-001's "end-to-end against the real PoC library" cannot be fully demonstrated** until the origin-template repo is migrated. 
The doc-sync agent correctly left this (different git repo) and recorded it as the residual FR-023. | +| 2 | MEDIUM | oversight | DoD gap (Constitution IX) | constitution §IX; `.agents/skills/skillrig/` | The consolidated skill was updated (new `references/search.md` + `index.md`; updated `add.md`/`SKILL.md`/`verify.md`) but **trigger-accuracy evals were NOT run** (`.agents/skills/skill-creator/scripts/run_eval.py` exists). Constitution IX: "a feature is not complete until its skill coverage is verified." | +| 3 | LOW | conscious | Deferred doc edit | `.specledger/memory/constitution.md` | The stale §III (httptest/go-vcr, `skill.toml`-as-index-source) and §IX (`scripts/run_eval.py` path) wording is **not** amended — correctly deferred to a team-approved amendment (Governance). The touch-up list IS recorded in `docs/ROADMAP.md` + `docs/ARCHITECTURE-v0.md` (review finding C14), so it is tracked, not lost. constitution.md was NOT unilaterally edited (verified). | + +### DoD per User Story + +| US | Title | DoD (from acceptance scenarios) | Status | +|----|-------|---------------------------------|--------| +| US1 | Discover (search) | query over name+desc; topic filter (AND, exact); empty=exit 0; `--json` complete; help ≥2 examples | ✅ scenarios + tests present (SearchQueryMatchesNameDesc / FilterByTopic / OrderingDeterministic / EmptyResult / JSONComplete / ConventionMismatch / ConventionBoundary / HelpExamples) | +| US2 | Acquire remotely (add) | remote fetch, no local checkout → byte-identical vendor + lock(version/commit/treeSha); `verify` passes; idempotent no-op exit 0; force-on-divergence | ✅ AddRemoteNoLocalCopy / Idempotent / ForceOnDivergence / DryRun / HelpExamples | +| US3 | Reproducible pin | `--pin` recorded + reproducible; bare-semver==full-tag; non-existent → `NoSuchVersionError` | ✅ AddPinnedReproducible / AddPinTagFormEquivalent / AddPinNotFound (typed) | +| US4 | Trustworthy failures | convention / auth / unreachable distinct + actionable; 
`--verbose` raw cause | ✅ Classify* unit + AddAuth/PrivateNotFound/Unreachable/Verbose | +| US5 | Catalog generation (index) | generate from skills incl topics; deterministic; `index`==committed; not-in-origin + missing-version errors | ✅ IndexGenerates / Deterministic / MatchesCommitted / Malformed / NotInOrigin / MissingVersion | +| — | Skill co-evolution (IX) | skill updated AND trigger evals verified | ⚠️ skill updated; **evals not run** (Divergence #2) | + +### Verified faithful (adversarial spot-checks) + +- **C1** `CheckConvention(n)` is exact `n == supportedConvention` (==1) — lower/absent now fail; not the old `>`-only check. +- **C7** `GenerateCatalog` reads `convention_version` from `.skillrig-origin.toml` (not hardcoded). +- **S5/D8** `Search` is stdlib `slices.SortFunc` by relevance bucket (exact-name 3 / name 2 / topic 1 / desc 0) then name; no fuzzy/semantic. +- **C2** `NoSuchVersionError` is a distinct typed error (errors.go) used by fetch.go/add.go; rendered distinctly by cli. +- **C3** `resolvePin` expands bare `^v?SEMVER$` via the `name-vSEMVER` scheme, else literal ref/SHA. +- **S3/D4** `ResolveGitHubToken` = `GH_TOKEN`→`GITHUB_TOKEN`→`gh auth token --hostname` via `os/exec`; token via `git -c http.extraHeader`; GHE seam. +- **Exit codes:** new errors map (via `mapXError`→`UsageError`) to exit 1; only `*VerifyFailure`→2; exit 3 reserved. No leak. +- **No stubs:** zero `not implemented`/`TODO`/`panic(` in `pkg/skillcore`/`internal/cli` non-test. +- **No scope creep:** only `search`/`index` commands + `--pin`/`--topic` flags (all planned). +- `skill.toml` deleted; `manifest.go` on `gopkg.in/yaml.v3`. + +> **Adversarial note (why a deep-dive is still warranted):** this is a divergence review, not a line-by-line correctness audit of ~3,838 new lines. 
The 002 precedent is instructive — `make check`-green code there still harbored real bugs (idempotency when manifest name≠dir, skill-name path traversal, masked git failures, a `-cover` stderr leak) found ONLY by independent cold-context adversarial review. The generated `fetch.go`/`catalog.go`/`add.go` remote path and the new `file://` test substrate are exactly the kind of new surface that warrants it. + +### Issues Encountered & Resolutions +- `go get gopkg.in/yaml.v3` succeeded inside the workflow (the feared network block did not occur); `go.mod`/`go.sum` updated, build clean. +- Mid-flight LSP diagnostics (yaml import, undefined renderers) were transient — the make-check repair loop (Phase 7) closed them; final gate green. + +### Items Requiring Action Before Merge +1. **[MEDIUM] FR-023 residual** — migrate the `skillrig-origin` repo's `skill.toml`→`SKILL.md` frontmatter and regenerate its `index.json` via `skillrig index`, so SC-001 (real-PoC end-to-end) is demonstrable. +2. **[MEDIUM] Constitution IX** — run the skill trigger-accuracy evals (`run_eval.py`, model `sonnet`) for the updated `skillrig` skill; record results. +3. **[LOW] Constitution amendment** — land the team-approved §III/§IX touch-ups (list already in ARCHITECTURE/ROADMAP). +4. **[RECOMMENDED] Independent adversarial deep-dive** before merge (see note) — and/or a `/code-review` / Qodo pass on the PR. + +### Tests & Checks +- **Status: PASS** +- Commands: `make check` (gofmt + go vet + golangci-lint + go test) → **0 lint issues, all pass**; re-validated uncached: `go test -count=1 ./test/...` (3.27s), `./pkg/skillcore/...`, `./internal/...` all PASS. `go build ./...` clean. +- Scenario count: **60** `TestQuickstart_*`. +- Failures: none. + +### Uncommitted Changes +- None (HEAD `429f0ca`, working tree clean). 
+ +--- + +## Adversarial Deep-Dive: 2026-05-31 (independent cold-context Opus, HEAD `4bcd7f9`) + +**Flipped verdict: the P1 remote-acquisition keystone is unimplemented-as-shippable and untested.** `make check` + `go test -count=1 ./...` pass ONLY because every remote scenario is `t.Skip`-ped (0% coverage on the fetch path). The `index`, manifest-migration, local-path-`add`, and `search`-against-a-local-checkout surfaces are genuinely working and well-tested. All findings below independently verified by this session before recording. + +| # | Sev | Type | Verified | Issue (file:line) | +|---|-----|------|----------|-------------------| +| C1 | CRITICAL | oversight | ✅ | `internal/cli/search.go loadCatalog` reads `os.ReadFile(repoRoot/OWNER/REPO/index.json)` only — **`search` never fetches a remote catalog**. FR-001/A2/contract step 2 unmet; the remote `init→search` workflow cannot run. | +| C2 | CRITICAL | oversight | ✅ | `pkg/skillcore/fetch.go cloneURL` hardcodes `https://github.com/OWNER/REPO.git`; `internal/config/origin.go originPattern` rejects `file://`. The S4 `file://` substrate is unwireable, so **all 10 remote `TestQuickstart_*` are skipped** (`remoteSubstrateBlocked`); `FetchSkill/FetchSparse/Clone/acquireRemote/resolvePin/classifyFetchError` are 0% covered; the named lock-treeSha==raw-git oracle does not exist. | +| C3 | CRITICAL | oversight | ✅ | `fetch.go classifyFetchError`: when `req.Pinned`, **any** NotFound is rewritten to `NoSuchVersionError` — a missing/private *repo* with `--pin` reports "no such version" (reproduced). Conflates the two classes FR-015/C2 require distinct. | +| H1 | HIGH | oversight | ✅ | `acquireRemote` never calls `CheckConvention` — remote `add` skips the convention gate (FR-016 "both search and add"); `mapConventionError` is dead code from add's path. | +| H2 | HIGH | oversight | ✅ | `resolvePin` 0% covered; SC-004 two-pin-form equivalence asserted nowhere (test skipped). 
| +| H3 | HIGH | oversight | ✅ | `classifyFetchError` not-found→authenticate-hint refinement + non-`*GitError` passthrough untested. | +| M1 | MEDIUM | oversight | ✅ | `manifest.go Require` has only `yaml:` tags → `encoding/json` emits PascalCase (`"Tool"/"Version"`) in `index.json` AND `search --json`; data-model §2 specifies lowercase. `jq '.skills[].requires[].tool'` → null. **`IndexMatchesCommitted` is a circular oracle** (producer==committed fixture, both wrong) and cannot catch it. | +| M2 | MEDIUM | oversight | ✅ | Remote not-found always sets `Skill`, so a missing *origin repo* renders as "skill X not found"; `OriginNotFoundError` only on the local-path form. | +| L1 | LOW | oversight | ✅ | Commit-SHA `--pin` likely unfetchable — `fetchSparseInto` clones tips then `checkout `; an arbitrary SHA not on a tip won't checkout (no `fetch `). | +| L2 | LOW | oversight | ✅ | `AuthError`/`UnreachableError` built with empty `Origin` in `ClassifyGitError`, never repopulated — latent trap. | + +**Confirmed-good (independently):** manifest→`SKILL.md` frontmatter did not regress 002 (local add/verify/idempotent exercised live); exact-match convention `== 1` (0/absent/higher fail); two-layer + single-`skillcore`-fetch/tree-SHA (AP-04) hold; skill co-evolution done; `search`/`add`/`index --help` each ≥2 examples. + +**Root cause:** no `file://`/local-path **origin** seam (`originPattern` rejects it; `cloneURL` hardcodes GitHub). This blocks the S4 test substrate AND means the D3 explicit-local-path-origin mode (FR-011) is not actually configurable. Fixing this seam unblocks the 10 tests and underpins C1/H1. + +**DoD status:** US1 PARTIAL (local half works; remote `search` unbuilt), US2 NOT MET (all skipped + H1), US3 NOT MET (skipped + C3), US4 PARTIAL (unit stderr-map only; integration skipped + C3), US5 MET (modulo M1 circular oracle). + +### Items Requiring Action Before Merge (REVISED — blocking) +1. 
**[CRITICAL/root] file://-or-local-path origin seam** — accept a `file://`/path origin (`origin.go`) and derive the clone target from it (`cloneURL`/`fetch.go`); this also delivers FR-011 local-path mode. Unblocks the test substrate. +2. **[CRITICAL] C1** — implement remote catalog fetch in `search` (fetch `index.json` per call when the origin is remote). +3. **[CRITICAL] C3 + M2** — only classify `NoSuchVersionError` when the repo/skill exists but the ref does not; preserve repo-vs-skill-vs-version distinction. +4. **[CRITICAL] C2/H2/H3** — un-skip the 10 remote scenarios; wire them to the `file://` bare-repo substrate with the raw-git treeSha oracle; cover `resolvePin`/`classifyFetchError`. +5. **[HIGH] H1** — gate `skillrigConvention` in remote `add`. +6. **[MEDIUM] M1** — add `json:` tags to `Require` (lowercase); regenerate the committed `index.json` fixture; break the circular oracle (assert lowercase keys + a non-self fixture). +7. **[LOW] L1/L2** — commit-SHA pin fetch strategy; populate `Origin` on Auth/Unreachable. + +### Tests & Checks (deep-dive) +- `make check`: PASS (0 lint). `go test -count=1 ./...`: PASS — **but 10 remote `TestQuickstart_*` are `t.Skip`-ped**, so "green" overstates readiness; the P1 remote surface is neither exercised nor reachable from the binary. `pkg/skillcore` coverage 72% overall, fetch path ~0%. + +### Process lesson +- Checkpoint missed this by checking test-function *existence*, not execution. **Add `grep -rn "t.Skip" test/` to the checkpoint routine**; treat any skipped acceptance scenario as an unmet DoD, not a pass. + +--- + +## Remediation + Re-review: 2026-05-31 (HEAD `5f90a82`) + +**Remediation** (focused workflow, 4 phases, HARD gate = no skipped acceptance test + the 10 remote scenarios MUST run+pass) closed all deep-dive findings. 
**Independently verified:** `make check` 0 lint, `go test -count=1` pass, **11 remote `TestQuickstart_*` RUN+PASS, 0 skipped acceptance scenarios**, fresh `skillrig index` emits lowercase + unescaped `requires`. Fixes confirmed real in code: `file://`/local-path origin form (`origin.IsLocal()`), remote `search` fetch (`FetchCatalog`), `gateRemoteConvention` on add (H1), distinct `NoSuchVersionError` on a failed pin *checkout* only (C3), `json:` tags + `SetEscapeHTML(false)` (M1), de-circularized `IndexMatchesCommitted`. The verify gate even caught its own incomplete `>=` escaping fix and finished it. Origin repo re-indexed clean (`79d23f4`). + +**Independent cold re-review** (Opus, HEAD `5f90a82`) — **verdict: "the remote keystone is now GENUINELY built and HONESTLY tested. The remediation is real, not theater."** The central suspicion (is the `file://` substrate a real clone or a working-tree masquerade?) was cleared: real `git clone --bare` over `file://`, RAW-git tree-SHA oracle, no circular oracle; the reviewer hand-reproduced the full `search→add→verify` round-trip + every error class. 4 new findings, **0 CRITICAL / 0 HIGH**: + +| # | Sev | Type | Disposition | +|---|-----|------|-------------| +| F1 | MEDIUM | oversight | Remote `add` convention gate correct (verified) but lacked an acceptance test → **fixing** (`TestQuickstart_AddRemoteConventionMismatch`). | +| F2 | MEDIUM | oversight | Token-injection argv (`-c http.extraHeader=…`) verified by inspection only → **fixing** (unit test of `authConfigArgs` format + `Clone` argv ordering; security-relevant). | +| F3 | LOW | conscious-by-omission | A missing/unreachable `file://`/local origin renders generically (not `UnreachableError`) — `ClassifyGitError` matched only GitHub-shaped stderr → **fixing** (add local-git stderr anchors). 
| +| F4 | LOW | **conscious — WON'T-FIX (documented)** | Asymmetric convention gating: a remote `OWNER/REPO` *with* a 002 local checkout is convention-gated on `search` but not on `add` (which takes the ungated 002 `acquireLocal` path). **Accepted:** consistent with 002 (which had no catalog), the local checkout is operator-controlled, and the remote-fetch `add` (the FR-006 path) IS gated. Revisit only if the local-checkout `add` form gains a catalog dependency. | + +**Confirmed-good by the re-reviewer:** no dead code, option-injection guards present, `file://` vs bare-path handling correct, idempotent/force/dry-run over remote correct, 002 local-path suite (22 scenarios) intact, https remote path intact. + +**Disposition:** F1/F2/F3 being closed in a focused follow-up (green-gated: make check + the new tests run+pass + no new skips). F4 accepted as above. After that → PR-ready; remaining non-blocking items are the deferred Constitution IX skill evals, the team-approved constitution amendment, and pushing the `skillrig-origin` repo. + +--- diff --git a/specledger/003-search-remote/spec-tech.md b/specledger/003-search-remote/spec-tech.md new file mode 100644 index 0000000..f707396 --- /dev/null +++ b/specledger/003-search-remote/spec-tech.md @@ -0,0 +1,108 @@ +# Technical Companion — `003-search-remote` (Discover & Acquire) + +**Status:** Draft — input to `/specledger.plan`. Companion to [spec.md](./spec.md) (the user-facing contract). +**Purpose:** capture every implementation-level decision surfaced during specify, and enumerate the **seven open decisions deliberately deferred to `/specledger.clarify`**. spec.md stays non-technical; this file is where transport, authentication, fingerprint, and test-tier mechanics live. Mirrors how 002 used `spec-tech-spike.md`. 
+ +> Binding docs (do not contradict): `docs/ARCHITECTURE-v0.md` §2 (command surface), §2d (origin discovery + convention-version contract), §4.2 (treeSha = label-honesty, commit = provenance), §8b.2 (auth resolution), §9 (`index.json` + search), §9b (`OWNER/REPO[/path]@ref` identity grammar, immutable pins); `docs/design/cli.md` (search = **Query** pattern, remote add = **Vendor Mutation** pattern; errors-as-navigation; two-level output; standard flags; exit codes). Single-resolver rule (`config.ResolveOrigin`) and single-`skillcore` rule (AP-04 / AP-06) hold: remote fetch + catalog parsing get **exactly one** implementation in `pkg/skillcore`, shared by `search` and `add`. + +--- + +## 1. What ships in this slice + +Three commands — two consumer-side over the resolved origin, plus one origin-side generator (added by Spike S2): + +| Command | cli.md pattern | Exit codes | New surface | +|---|---|---|---| +| `skillrig search [QUERY...] [--topic T ...]` | Query | 0 (incl. empty result), 1 (usage/config) | brand new — query-first (token-AND substring over name+desc+topics), deterministic order; `--topic` filter (S5/D8) | +| `skillrig add [--pin ] [--dry-run] [--force] [--json] [--verbose]` | Vendor Mutation | 0 (vendor or idempotent no-op), 1 (usage/config) | extends 002's local-copy `add` with a **remote** acquisition path + `--pin` | +| `skillrig index [--out ]` | (origin-side generator) | 0 ok, 1 (usage/config) | brand new — **added IN-SCOPE by Spike S2** (single-tip catalog generator; the producer `search` consumes) | + +The consumer commands resolve the origin through `config.ResolveOrigin` (precedence: `SKILLRIG_ORIGIN` env > project `.skillrig/config.toml` > global) and must read/honor the origin's **convention version** before acting (see §4). `skillrig index` runs **inside the origin repo** (in its `index.yml` CI on merge to `main`), walking `skills/*/SKILL.md` and emitting `index.json` via the **same** `ParseManifest` the consumer commands use (AP-04). 
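The origin precedence described above (env, then project config, then global) can be sketched as follows. This is a minimal illustration and not the real `config.ResolveOrigin`: the function name `pickOrigin`, the `originSource` type, and the plain-string inputs are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// originSource names the layer that supplied the origin (hypothetical type).
type originSource string

const (
	fromEnv     originSource = "env"
	fromProject originSource = "project"
	fromGlobal  originSource = "global"
)

// pickOrigin returns the first non-empty candidate in precedence order:
// SKILLRIG_ORIGIN env > project .skillrig/config.toml > global config.
// Illustration only; the real resolver also parses OWNER/REPO[@REF].
func pickOrigin(env, project, global string) (string, originSource, error) {
	switch {
	case env != "":
		return env, fromEnv, nil
	case project != "":
		return project, fromProject, nil
	case global != "":
		return global, fromGlobal, nil
	}
	return "", "", errors.New("no origin configured")
}

func main() {
	// No env override, so the project-level value wins over the global one.
	origin, src, err := pickOrigin("", "my-org/my-skills", "other-org/other-skills")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s (from %s)\n", origin, src)
	// prints: my-org/my-skills (from project)
}
```

Returning the source alongside the value is one way to let `--verbose` output say where the origin came from; whether the real resolver reports this is not stated here.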
It is scope-bounded to the single-tip generator — no tag-history, no GC (S2). + +Also in-scope by S1: the **manifest migration** to agentskills.io frontmatter + `metadata.x-skillrig.*` (drop `skill.toml`) — commit 1 of 003, the field-source for both `index` and `add`/`verify`. + +Out of scope (reserved): verification-failure exit 2 and prerequisite exit 3; `bump`; multi-client symlink materialization; `doctor`; the policy/allowlist enforcement in `policy.toml` (v1 governance); cross-ref/version-history aggregation in the catalog (D-S2-tip); GitHub Enterprise auth (S3). + +## 2. Ground truth — the real PoC origin + +`github.com/skillrig/origin-template`, checked out at `/Users/vincentdesmet/specledger/skillrig/skillrig-origin`. Relevant artifacts as they exist today: + +- **`.skillrig-origin.toml`** — `convention_version = 1`, `origin = "my-org/my-skills"`, `skills_dir = "skills"`, `cmd_dir = "cmd"`, `tag_scheme = "name-vSEMVER"` (a skill version is tagged e.g. `terraform-plan-review-v1.4.0`). +- **`index.json`** (the catalog) — carries `skillrigConvention: 1`, `origin`, and `skills[]` with `name / version / namespace / description / tags / path / requires`. **No per-skill `treeSha` or `commit`** — the catalog is discovery-only. +- **`skills/terraform-plan-review/`** — `skill.toml` (full manifest incl. `[[requires]]`) + `SKILL.md`. The single sample skill. +- **`scripts/build-index.sh`** — a documentation-grade catalog emitter that **emits a reduced schema** (`name/version/description/path` only — drops `topics`, `namespace`, `requires`). This **drifts** from the committed `index.json` and from what `search --topic` needs. Reconciling this is FR-023 (scoped by Spike S2, §8b). +- **`policy.toml`** — external-source allowlist (v1 governance; not consumed here). 
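As a rough sketch of how the catalog described above might be decoded and gated on its convention version: the type and function names (`catalog`, `skill`, `require`, `parseCatalog`) are hypothetical, the field set assumes the reconciled shape with `topics`, and the sample values are illustrative only. The exact-match check mirrors the `== 1` policy this slice settled on.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// require uses explicit lowercase json tags; without them encoding/json
// would emit PascalCase keys ("Tool"/"Version").
type require struct {
	Tool    string `json:"tool"`
	Version string `json:"version"`
}

// skill is one catalog entry (hypothetical struct for this sketch).
type skill struct {
	Name        string    `json:"name"`
	Version     string    `json:"version"`
	Namespace   string    `json:"namespace,omitempty"`
	Description string    `json:"description"`
	Topics      []string  `json:"topics"`
	Path        string    `json:"path"`
	Requires    []require `json:"requires,omitempty"`
}

// catalog is the top-level index.json document.
type catalog struct {
	SkillrigConvention int     `json:"skillrigConvention"`
	Origin             string  `json:"origin"`
	Skills             []skill `json:"skills"`
}

const supportedConvention = 1

// parseCatalog decodes index.json bytes and applies an exact-match
// convention gate: an absent field decodes to 0 and is rejected.
func parseCatalog(data []byte) (*catalog, error) {
	var c catalog
	if err := json.Unmarshal(data, &c); err != nil {
		return nil, fmt.Errorf("parse catalog: %w", err)
	}
	if c.SkillrigConvention != supportedConvention {
		return nil, fmt.Errorf("incompatible convention %d (supported: %d)",
			c.SkillrigConvention, supportedConvention)
	}
	return &c, nil
}

func main() {
	raw := []byte(`{
	  "skillrigConvention": 1,
	  "origin": "my-org/my-skills",
	  "skills": [{
	    "name": "terraform-plan-review",
	    "version": "1.4.0",
	    "description": "illustrative description",
	    "topics": ["terraform"],
	    "path": "skills/terraform-plan-review",
	    "requires": [{"tool": "terraform", "version": ">=1.5"}]
	  }]
	}`)

	c, err := parseCatalog(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(c.Skills[0].Name, c.Skills[0].Topics)
	// prints: terraform-plan-review [terraform]

	_, err = parseCatalog([]byte(`{"origin":"my-org/my-skills","skills":[]}`))
	fmt.Println(err)
	// prints: incompatible convention 0 (supported: 1)
}
```

Keeping decode and gate in one function is a design choice for the sketch: both `search` and `add` would then pass through the same check, in line with the single-implementation rule.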
+ +**Schema the CLI will consume for `search`** (must be the reconciled, full shape): per-skill `name`, `version`, `description`, `topics[]` (renamed from `tags` — S5/D8), `path`; catalog-level `skillrigConvention`, `origin`. `namespace` and `requires` may be carried but are not required by `search` this slice. + +## 3. The seam being replaced (from 002) + +002's `add` treats the origin as a **local checkout** at `/OWNER/REPO`: + +- `internal/cli/add.go:139` `originDirRef(origin)` maps a resolved `config.Origin{Owner,Repo,Ref}` → a local directory `OWNER/REPO` (anchored at repo root after the AR-1 fix) + ref. +- `pkg/skillcore` then reads the skill subtree from that local dir, computes the git tree-SHA, and vendors byte-identically. + +Remote `add` introduces a **fetch step in front of that subtree read**: instead of requiring the directory to already exist, fetch the skill's content (and the catalog, for `search`) from `github.com/OWNER/REPO@ref`. The byte-identical vendoring + lock-write that follow are unchanged and stay in `pkg/skillcore` (AP-04). + +`config.Origin` already has `Owner / Repo / Ref` and parses `OWNER/REPO[@REF]` — no config-schema change needed. `Ref` is the origin-level **branch** pointer; a per-skill `--pin` is a separate immutable tag/SHA (§5, US3). + +## 4. Convention-version gate (cross-cutting, both commands) + +Architecture §2d.3: the generic binary speaks a **convention contract**; it must check the origin's `convention_version` (mirrored as `skillrigConvention` in the catalog) and **fail clearly** on an incompatible origin rather than misbehaving (FR-016). This binary supports convention `1` with an **exact-match policy** (decided 2026-05-31, review C1): `skillrigConvention == 1` passes; **any other value — higher, lower, or absent/`0` — fails** `IncompatibleConventionError` (FR-016). 
Exact-match is the YAGNI v0 choice; an `N`/`N-1` compatibility window is a deliberate future change (architecture Q14), not a `>`-only check that silently lets older/missing conventions through. The check happens once, through the shared core, for both `search` and `add`. + +## 5. Identity, fingerprint, and pins + +- **Provenance = `commit`**, **label-honesty = `treeSha`** (architecture §4.2). At remote-add time: fetch the skill subtree at the resolved commit, record that **commit** (provenance) and the **git tree-SHA computed from the fetched subtree** (label-honesty) into the lock — using the *same* `skillcore` tree-SHA code `verify` recomputes (R9/R14/N2). `verify` then checks on-disk content against the recorded tree-SHA, offline, exactly as today. +- **The origin publishes no per-skill tree-SHA** (the catalog has none). So label-honesty here means *"the on-disk content still matches what was vendored,"* anchored by provenance (you fetched it from the origin at a specific commit), **not** *"matches an origin-attested hash."* **Resolved (D-treesha, §8a):** this framing is accepted for v0; publishing per-version tree-SHAs is deferred. +- **Pin (`--pin `)** — an immutable tag or SHA per skill, distinct from the origin-level `@ref` branch. `tag_scheme = "name-vSEMVER"` ⇒ a version pin maps to tag `-v` (e.g. `--pin v1.4.0` → `terraform-plan-review-v1.4.0`, or accept the full tag). **Resolved (D-pin, §8a):** the lock records commit + treeSha + the resolved human-readable version/tag, so re-acquisition reproduces byte-identical content (FR-013/014, SC-004) *and* humans can reason about versions. A non-existent pin → distinct "no such version" (FR-015). + +## 6. Failure taxonomy (errors-as-navigation, FR-016–019) + +Three confusable classes must be **distinct typed errors** in `pkg/skillcore`, rendered with what/why/fix by `internal/cli`: + +1. **Incompatible convention** — origin's convention version unsupported (FR-016) → "update skillrig" class. +2. 
**Authentication** — private origin, no/invalid credentials (FR-017, R18) → distinct from not-found; point at how to authenticate. The top onboarding/CI footgun; surface loudly. Must be distinguishable from "origin not found" (a missing/typo'd repo) and "skill not found" (valid origin, no such skill). +3. **Unreachable** — network failure / wrong location (FR-018) → distinct from auth and compatibility. + +Plus the existing 002 classes carried forward: skill-not-found (vs origin-not-found, the AR-2/R2-M4 distinction), invalid-skill-name (path-traversal guard), overwrite-on-divergence. `--verbose` surfaces the raw underlying cause on every command (never swallowed). + +## 7. Test tier — the new network boundary + +002 had **no network boundary** (no `httptest`/go-vcr). This slice introduces one, so the test substrate is a first-class decision (**Spike S4**, §8b). Leaning: a **local bare git repo** acting as the "remote" origin (fixtures bootstrap it in a tmpDir, the same way 002 bootstrapped a local origin), exercised over `file://`/local-remote git transport — keeps the suite offline and deterministic while running the *real* fetch path. A ground-truth test must assert the **fetched tree-SHA == raw `git` tree-SHA** of the origin subtree (the §III ground-truth discipline, extended across the fetch). `TestQuickstart_*` scenarios exercise: remote add with no local copy, idempotent re-add, pinned reproducibility, each failure class, and search filtering/empty-result/json-completeness. Avoid coupling to live GitHub in the gate. + +## 8. Decisions & spike backlog (clarify session 2026-05-31) + +The specify phase's seven open decisions resolved into **firm decisions** (below) plus **four spikes** (§8b). The reviewer judged four areas to carry enough uncertainty that committing now risks rework, so they are time-boxed spikes that must run **before** `/specledger.plan`. + +### 8a. 
Firm decisions + +- **D-local (was OD#1) — origin classification.** The tool **never** creates or caches a local copy of a remote origin. The origin is *either* a remote `OWNER/REPO` (fetched) *or* an explicitly-configured local filesystem path; there is **no "both present" precedence**. Confirmed against the 002 code: `originDirRef` overloaded `OWNER/REPO` as a directory `/OWNER/REPO` and ran `git -C` on that **local working tree** (no `file://`, no remote). 003 must split the two forms cleanly: a path-shaped origin → operate on a local checkout (002 behavior, generalized to a real path rather than `/OWNER/REPO`); a bare `OWNER/REPO` → fetch remotely. *(The "repo root" in §3 is the **consumer** repo root.)* +- **D-pin (was OD#5 partial) — what the lock records.** Record **commit** (provenance) + **treeSha** (label-honesty) + the **resolved human-readable version/tag** of the pin — even if that tag is later rewritten upstream. A commit is opaque to humans; the tag carries the versioning scheme they reason about (older/newer). This extends 002's lock shape (see open question for S2: how this interacts with catalog tag-aggregation across refs). +- **D-treesha (was OD#5) — trust anchor.** Confirmed: `commit` = provenance, `treeSha` = computed-at-add (the origin publishes none), so label-honesty = "matches what was vendored," not "matches an origin-attested hash." Publishing per-version tree-SHAs is deferred (not this slice). +- **D-catalog-fetch (was OD#3) — freshness.** Fetch the catalog **per `search` call**; no offline cache this slice; unreachable origin → the unreachable error (FR-018). (Caching is a later optimization.) +- **D-convention (naming clarity).** `convention_version` (mirrored `skillrigConvention`) is the **origin-contract / schema-compatibility** version the binary gates on — it is *not* a content version, and there is no separate `catalog.version`. Keep the name; it answers "does this binary understand this origin's layout?" + +### 8b. 
Spike backlog — run before `/specledger.plan` + +Each spike is time-boxed and writes to `specledger/003-search-remote/research/2026-05-31-.md`; findings fold back into this file and may revise the assumptions above. + +- **S1 — Skill manifest format (was OD#6 input; the big one).** *Question:* keep `skill.toml` (002) or move skill metadata into agentskills.io **frontmatter** (`SKILL.md`) with skillrig extensions? **Working hypothesis (strong, from project history):** *drop `skill.toml`* and extend the agentskills.io frontmatter — use standard fields (`name`, `description`, `license`, `compatibility`, `allowed-tools`) verbatim (`allowed-tools` is the natural home for backing-CLI `requires`), and put skillrig-specific data under namespaced keys in the standard's free-form `metadata` section (`x-skillrig.tags`, `x-skillrig.requires`, `x-skillrig.convention-version`). *Why it matters here:* the catalog `search` reads is **generated from this metadata**, so the manifest format is the field-source for the catalog. *Rationale recap:* the original "two audiences / travels-with-skill / offline-doctor / TOML-nicer-than-YAML" reasons for a sibling file don't survive scrutiny (frontmatter is more atomic, the standard's `metadata` field exists *specifically* for ecosystem extensions, and aligning preserves portability across 26+ agentskills.io clients); the only real cost is YAML-vs-TOML cosmetics + coupling to the standard (already incurred via R20–R21). *Spike validates:* how the Go `gh` CLI parses frontmatter; whether `allowed-tools` truly expresses `[[requires]]` (version constraint + private source); the `x-skillrig.*` shape; and the **migration scope** (002's `ParseManifest`/`skill.toml` reader, the origin template's `skill.toml` files, architecture §4.1) — i.e. whether 003 proceeds on frontmatter directly or a small "manifest reframe" must land first. Also fold in prior-art on search-by-topic (`gh skill`, Vercel `sl`/skills registry) — comment 6c26e223. 
+ - **✅ RESOLVED 2026-05-31** — research/`2026-05-31-skill-manifest-format.md`. **Decision: migrate to agentskills.io frontmatter + `metadata.x-skillrig.*`; drop `skill.toml`.** Evidence: the standard's `metadata` map is the sanctioned extension mechanism (the spec's own example even puts `version` under `metadata`); the Go `gh` CLI does exactly this in production (`internal/skills/frontmatter/frontmatter.go`, `gopkg.in/yaml.v3`, flat dotted-prefix keys like `metadata.github-tree-sha`). **Correction to the hypothesis:** `allowed-tools` does **NOT** carry `requires` — it's a space-separated string of agent-permission tool invocations and gh rejects an array form; so `version`, `tags`, `namespace`, `convention-version`, **and** `requires` all live under `metadata.x-skillrig.*`. **Catalog field-source:** `index.json` is generated FROM frontmatter — `name`/`description` from standard fields, the rest from `metadata.x-skillrig.*`, `path` from the dir. skillrig's generated, tag-filterable catalog is a deliberate divergence from gh (which has no catalog — it does code-search + a repo topic, no `--tag`); keep it, re-point the generator at frontmatter. **Migration: SMALL, in-slice → commit 1 of 003.** Only parser is `pkg/skillcore/manifest.go` (~47 lines, single caller `add.go:91`); `verify.go`'s `isSkillDir` already accepts `SKILL.md`; the rest is fixtures + the origin-template `skill.toml`→frontmatter fold (already FR-023). Do **not** spin a separate "reframe" feature — land it first so remote fetch + catalog-parse are written against the new format once. 
**Risks (carry into plan):** (1) `x-skillrig.requires` is a nested list, bending the spec's string→string letter — gh's `interface{}` parses it, namespaced; validate against `skills-ref validate`, fall back to a JSON-string if a strict validator rejects it; (2) adds `gopkg.in/yaml.v3` (unavoidable — frontmatter IS YAML; matches gh); (3) the PoC origin currently **duplicates `name`/`description`** across `SKILL.md` + `skill.toml` — a latent drift bug the migration removes. +- **S2 — Catalog generation & lifecycle (was OD#6).** *Question:* who generates/maintains the catalog, and how? skillrig is the single tool for end-users **and** origin maintenance, so "consume-only + roadmap a generator" may be a false economy. *Spike covers:* catalog generation from S1's metadata source; **cross-ref tag aggregation** (does the catalog at `origin@ref` aggregate skills/tags across older refs/releases? comment 8e05b856 reply — strongly shapes indexing); **append-only vs full-regenerate** as release-please cuts per-skill tags; **garbage collection** (or YAGNI); and whether `skillrig index` belongs in 003's scope or a sibling feature. Output: the catalog contract + the scope decision for FR-023. Depends on S1 (the field-source). + - **✅ RESOLVED 2026-05-31** — research/`2026-05-31-catalog-generation-lifecycle.md`. Ground truth: the origin's `.github/workflows/index.yml` is **`push: main` (paths `skills/**`) triggered, not release-triggered**; it full-regenerates `index.json` from the **HEAD tree** and commits if changed. **D-S2-tip — single-tip, NOT cross-ref aggregated:** the catalog at `OWNER/REPO@ref` reflects only the branch tip (one version per skill = HEAD version; the manifest `version` is kept == the latest released tag by release-please). Version *history* lives in git tags, reached by `add --pin ` fetching the tag subtree directly (S4) — the catalog is **never** the version-history index. 
(Aggregation would need tag-walking, grow unbounded, and break the single root `skillrigConvention`.) Removed-at-HEAD skills correctly drop from `search` while already-vendored consumers stay fine (their lock is offline-verifiable). **D-S2-regen — full-regenerate, GC is YAGNI:** `index.json = f(HEAD frontmatter)`; nothing accumulates. **D-S2-scope — `skillrig index` ships IN 003, not a sibling** (the consequential reversal of the original "consume-only" lean): `search` is useless against a drifting catalog, and `build-index.sh` *provably drifts* (emits only `name/version/description/path`, drops `tags`/`requires` — that is FR-023). The hard part (the frontmatter parser) is **already built in 003 by S1**, so `skillrig index` is a thin `walk skills/*/SKILL.md` + shared `ParseManifest` + marshal — satisfying AP-04 by construction — and `index.yml` is already authored to call it (`command -v skillrig … skillrig index --out`). Scope-bounded to the **single-tip generator** (no tag-history, no GC). **Contract test:** `skillrig index` over the origin fixture MUST equal the committed `index.json` (producer == artifact, mirroring S4's tree-SHA oracle). **FR-024 doc fix:** architecture §9's "on release" wording is stale — it's "on merge to main." +- **S3 — Auth / token resolution (was OD#4).** *Question:* how does skillrig obtain a token to fetch a private origin? **Direction (decided):** shell out to `gh`/`git` via Go's `os/exec` — **not** `gh`-as-a-library (too heavy). *Spike covers:* mise's token-resolution path (mise is open-source **Rust** — `git clone` to a temp dir to study it), and the already-checked-out Go `gh` CLI at `/Users/vincentdesmet/specledger/skillrig/gh-cli` (how it resolves/uses tokens via `os/exec` rather than as a vendored lib); confirm the precedence skillrig should mirror. **Defer GitHub Enterprise hosting** to roadmap/backlog (note it, don't build it). Output: the token-resolution order + the auth-failure detection/wording for FR-017.
+ - **✅ RESOLVED 2026-05-31** — research/`2026-05-31-auth-token-resolution.md`. **Token order (3 steps, via `os/exec`):** (1) `GH_TOKEN` env → (2) `GITHUB_TOKEN` env → (3) `gh auth token --hostname github.com` (exit 0 + non-empty stdout = token; non-zero = no session, skip, not fatal; `gh` absent = skip silently). `git credential fill` **deferred** (the `gh` path already covers keyring tokens mise reads from `hosts.yml`). **Failure classification — all three classes exit 128, distinguished by stderr** (feeds S4's typed errors): `Authentication failed` / `Invalid username or token` → **AuthError** (FR-017); `repository '…' not found` → **NotFound**; `Could not resolve host` / `Failed to connect` → **UnreachableError** (FR-018). **Private-repo subtlety:** GitHub returns *not found* (not 403) for a private repo with no/bad token — so "not found" + no token resolved MUST add the hint *"if this is a private origin, authenticate via `gh auth login` or set GITHUB_TOKEN."* **Token injection:** pass via `git -c http.extraHeader="Authorization: Basic …"`, **not** embedded in the clone URL (avoids history/process-listing leakage). **GHE:** design the seam as `ResolveGitHubToken(hostname string)` so Enterprise is a one-line extension later. **Doc-correction for FR-024:** architecture §8b.2's claimed mise precedence (`credential_command > MISE_GITHUB_TOKEN > …`) is inaccurate — mise's real order puts env vars first; this doesn't affect skillrig (it never uses mise's `credential_command`), but the architecture claim should be fixed. +- **S4 — Remote-git testing (was OD#7). ✅ RESOLVED 2026-05-31** — research/`2026-05-31-remote-git-testing.md`.
**Three-tier substrate:** (1) **happy/integrity** → `file://` + a local **bare** repo in `t.TempDir()` (bootstrap the fixture working tree as 002 does, push to a bare, point the CLI at `file://`) — runs the real `git clone --sparse` offline; ground-truth assertion `fetched treeSha == rawTreeSHA(fixture,"HEAD","skills/")` (oracle-independence D11 carried across the fetch). (2) **FR-017 auth / FR-018 unreachable / transient** → **extend the EXISTING stub seam** (`gitClient.commandContext` in `pkg/skillcore/git.go`, the `TestHelperProcess` re-exec) to the new `Clone`/`FetchSparse` methods and inject `(exitCode=128, stderr=…)` — these stay `pkg/skillcore` unit tests; **no `httptest`, no new seam needed**. (3) **Reject** a real git-over-HTTP `httptest` server (smart-HTTP/CGI handshake is fragile, OS-specific, unnecessary). *Not coverable offline (future E2E/manual, not gate-blockers):* GitHub's real auth handshake, mid-stream TCP abort, HTTP 429. **Action for 003:** add **`AuthError`** + **`UnreachableError`** as distinct typed errors in `pkg/skillcore/errors.go`, classified from `GitError.Stderr` **inside the fetch layer** before returning to `cli` (keeps the §6 taxonomy in `skillcore`, AP-04). +- **S5 — Search algorithm, index storage, terminology. ✅ RESOLVED 2026-05-31** — research/`2026-05-31-search-index-architecture.md`. **Query-first:** `search [QUERY...]` is a case-insensitive **token-AND substring** over `name+description+topics`; `--topic` is a separate exact-string filter; **deterministic order** = relevance bucket (exact-name>name>topic>description) then lexicographic `name` (N6; no fuzzy/semantic/TF-IDF). **Storage:** keep the flat `index.json` (S2) + in-memory filter — index structures are YAGNI below ~10k docs; reject inverted/binary indexes (git-hostile). **Dependency: stdlib only** (`strings`/`slices`); `bleve` rejected; escape hatch `github.com/sahilm/fuzzy` (pure-Go) only if forgiving matching is ever wanted post-v0. 
**Terminology: `tag`→`topic`** (`--topic`, `topics[]`, `metadata.x-skillrig.topics`) — agentskills.io defines no such field, GitHub "topics" reinforces it, removes the git-tag/version-pin collision; flag is `--topic` not `--filter` (YAGNI). Matcher in `pkg/skillcore` (AP-04); no new catalog field. **Required artifact changes (done):** `tags`→`topics` across spec/data-model/contracts/quickstart + a free-text `[QUERY]` FR (FR-002/002a) with the stated ordering; origin-template frontmatter + regenerated `index.json` fold into the S1 migration commit. + +> **Decision still implicit (fold into S4/plan): fetch transport** (was OD#2) — shell `git` partial-clone + sparse-checkout vs raw HTTPS for the catalog. *Leaning (strong):* shell `git` for the subtree (keeps the "shell git, no in-process hashing dep" stance and makes the tree-SHA ground-truth trivial); decide catalog fetch (sparse single-file vs raw GET) alongside S3 (auth uniformity) and S4 (testability). + +## 9. Co-evolution work items (this branch touches two repos + docs) + +- **FR-023 — origin template (`skillrig/origin-template`):** reconcile `scripts/build-index.sh` and the committed `index.json` so the catalog carries every field `search` consumes (notably `topics`); record the schema as the convention-1 catalog contract; note the `skillrig index` follow-up. Track whether these origin-repo edits are part of this branch's PR or a sibling work item. +- **FR-024 — `docs/ROADMAP.md` + `docs/ARCHITECTURE-v0.md`:** record the divergences — (a) roadmap 003 + 004 ship as **one** combined slice; (b) the 002 local-checkout seam is superseded by / now coexists with real remote acquisition; (c) the catalog schema is pinned to what `search` consumes. Per CLAUDE.md, a CLI behavior change updates `docs/design/cli.md` in the same branch — add the `search` command (Query pattern) and the remote `add` surface (incl. `--pin`). 
+- **Skill co-evolution (constitution IX):** extend the single consolidated `skillrig` skill — add `references/search.md`, update `references/add.md` for the remote path + `--pin` + the new failure classes, and update the root routing/description keywords. Run trigger evals (`model: "sonnet"` per global instructions). + +## 10. Decision integrity (carried from 001/002, must stay consistent) + +single origin resolver (`config.ResolveOrigin`) · single `skillcore` (no parallel fetch/hash impl) · shell-`git` tree-SHA · `pkg/skillcore` as the public SDK boundary · byte-identical vendoring · idempotent no-op = exit 0 · refuse-overwrite-on-divergence (prompt/`--force`) · errors-as-navigation with `--verbose` raw cause · two-level output (human compact + complete `--json`) · exit 2/3 reserved (not emitted here) · path-traversal + symlink guards from the 002 Qodo round still apply to remotely-fetched content. diff --git a/specledger/003-search-remote/spec.md b/specledger/003-search-remote/spec.md new file mode 100644 index 0000000..b5d73cb --- /dev/null +++ b/specledger/003-search-remote/spec.md @@ -0,0 +1,234 @@ +# Feature Specification: Discover & Acquire Skills (`search` + remote `add`) + +**Feature Branch**: `003-search-remote` +**Created**: 2026-05-30 +**Status**: Draft +**Input**: User description: "Combine roadmap 003 (search) and 004 (remote add) into one MVP slice: a user who has bound their repo to an origin can FIND a skill in the org's library and VENDOR it straight from the remote library — with no pre-existing local copy of the library." + +> **Technical companion**: [spec-tech.md](./spec-tech.md) holds every implementation-level decision (origin classification, fetch mechanism, catalog handling, authentication sources, fingerprint semantics, the new network test tier) and the **seven open decisions deferred to `/specledger.clarify`**. This spec stays user-facing; the companion is the input to `/specledger.plan`. 
Where this spec says "the library catalog" or "records the skill's exact identity," the companion names the concrete artifacts. + +## Overview + +The first two slices made a repo *self-describing about where its skills come from* (001) and made the skills it already carries *honest* (002 — `add` from a **local copy** of the library, plus `verify`). But to vendor a skill today, a user must already have the entire library checked out next to their repo. That is a developer-only workaround, not something an organization can adopt. + +This slice closes that gap with the smallest coherent "discover & acquire" loop: + +1. A user **finds** the skill they want by browsing or filtering their org's library (`search`). +2. A user **vendors** that skill directly **from the remote library** — no manual checkout, no copy step — and the tool records its exact identity so it can be verified later (`add`, extended to fetch remotely). + +After this slice, the everyday path is: `skillrig init` → `skillrig search` → `skillrig add ` → `skillrig verify`. That is the first end-to-end experience a real consumer can adopt. + +Remote acquisition is **additive**: vendoring from a local copy of the library (shipped in 002) keeps working unchanged, as a development, offline, and air-gapped path. + +## Clarifications + +The specify phase left seven decisions open. The clarify session below resolved them into **two firm decisions** and **four research spikes** (skill-manifest format, catalog generation/lifecycle, authentication, remote-git testing). **All four spikes are now complete** (writeups in [spec-tech.md](./spec-tech.md) §8b + `research/2026-05-31-*.md`); their outcomes are folded in and the assumptions below stand as revised. + +> **Readiness:** spikes complete — this feature is **plan-ready** (`/specledger.plan`). 
Two outcomes expand scope and must carry into planning: (a) **skills migrate to a single `SKILL.md` with standard frontmatter** (the separate manifest file is dropped) — landed as the first step of the work; (b) the tool **also generates the library catalog** (`skillrig index`), because discovery is meaningless against a hand-maintained catalog that drifts. See [Assumptions](#assumptions) A7–A8. + +### Session 2026-05-31 + +- Q: How should 003 treat the skill manifest format (`skill.toml` vs the agentskills.io frontmatter + `metadata` standard the catalog is built from)? → A: **Spike before deciding** — compare `skill.toml` vs agentskills.io frontmatter + `metadata` namespacing (e.g. `skillrig.dev/requires`), toml/yaml fragility, and how the Go `gh` CLI parses frontmatter; lock the catalog field-source only after. (Spike S1) +- Q: Who generates and maintains the catalog `search` reads, and is that in 003's scope? → A: **Spike the catalog lifecycle first** — generation, cross-ref tag aggregation across origin refs, append-only vs full regenerate, and garbage-collection — because it strongly shapes how indexing works; scope the generation work after. (Spike S2) +- Q: Does the tool ever create/cache a local copy of a *remote* origin? → A: **No.** Acquisition fetches from the GitHub `OWNER/REPO`. A "local origin" exists **only** when the user configures a local filesystem path via `init` (and as a test fixture); there is no tool-managed cache and no "prefer local copy when both present" rule. (Confirmed against the 002 code: 002 overloaded `OWNER/REPO` as a directory `/OWNER/REPO` and ran `git` directly on that working tree — it never used `file://` or a remote. 003 splits the two forms: an explicit local path vs a remote `OWNER/REPO`.) This supersedes assumption A1 and aligns with fetch-catalog-per-search. +- Q: How is authentication to a private origin handled this slice? 
→ A: **Spike mise's token-resolution path** (mise is open-source Rust — clone to a temp dir to study it) and the already-checked-out Go `gh` CLI source (`/Users/vincentdesmet/specledger/skillrig/gh-cli`); shell out to `gh`/`git` via Go's `os/exec` for the token, **not** `gh`-as-a-library (too heavy); defer GitHub Enterprise hosting to the roadmap/backlog. (Spike S3) +- Q: When a skill is acquired at a pin (tag/ref), what is recorded in the lock? → A: **commit + treeSha + the resolved human-readable version/tag** — provenance (commit) and label-honesty (treeSha) for the machine, plus the version/tag (even if later rewritten upstream) so humans can reason about older/newer. +- Q: How are the network-failure FRs (auth / unreachable / transient) tested, given 002 used a local git working tree with no `file://` and no remote? → A: **Spike remote-git testing** — `file://` (or a local bare repo) for the happy/integrity fetch path, plus `httptest` or a fault-injection seam for the network-error FRs that `file://` cannot reproduce; decide the seam before planning. (Spike S4) +- Q: S1 concluded the frontmatter migration adds a YAML-parsing dependency (`gopkg.in/yaml.v3`), contradicting the "no new dependencies" note — accept it? → A: **Accepted** — align with the agentskills.io standard; `gopkg.in/yaml.v3` is adopted (the same parser `gh` uses), recorded as a deliberate divergence in the FR-024 architecture update. +- Q: Is `search` only a topic filter, and does "tag" collide with git tags? → A: **Resolved by Spike S5** (`research/2026-05-31-search-index-architecture.md`). **`search [QUERY]` is query-first** — the positional argument is a free-text **query matched against name + description (+ topics)**, deterministic token-AND substring, ordered by a fixed relevance score then `name` (no fuzzy/semantic — N6).
The label concept is **renamed "topic"** (`--topic` flag, `topics[]`, `metadata.x-skillrig.topics`) — the agentskills.io spec defines no tags/topics field, so nothing upstream breaks, and it removes the collision with **git tags** (version pins, `name-vSEMVER`). The flag is **`--topic`** (not a generic `--filter` — YAGNI, one dimension). **A flat `index.json` + in-memory filter is sufficient** (catalog is tens–hundreds of entries; an index structure earns its keep only at ~10k+ docs in a long-lived process) — **stdlib-only**, no search dependency. + +## User Scenarios & Testing *(mandatory)* + +### User Story 1 - Discover a skill in the org library (Priority: P1) + +A developer (or their agent) has bound the repo to an origin but does not know the exact name of the skill they want. They type a **free-text query** (e.g. `skillrig search terraform plan`) that is matched against each skill's name and description, optionally narrowing further by topic, and get back a short, scannable answer they can act on — including the exact name to feed to `add`. + +**Why this priority**: You cannot vendor what you cannot find. Discovery is the entry point of the loop and the lowest-risk half of the slice; on its own it already delivers value (an agent can enumerate available skills before deciding). It is independently demonstrable without any change to how skills are vendored. + +**Independent Test**: Bind a repo to a library that publishes a catalog of skills; run the search command with and without a topic filter; confirm the matching skills are listed (human-readable and machine-readable), that filtering is exact and repeatable, and that an empty result is reported as a clean "nothing matched" rather than an error. + +**Acceptance Scenarios**: + +1. 
**Given** a repo bound to a library that publishes two or more skills, **When** the user runs `skillrig search`, **Then** the tool lists every published skill with enough detail (name, version, one-line description) to choose one, plus a footer hint pointing to the next step. +2. **Given** the same repo, **When** the user runs `skillrig search` filtered to a topic that only some skills carry, **Then** only the skills carrying that topic are listed, and the result is identical on repeated runs. +3. **Given** the same repo, **When** the user filters to a topic no skill carries, **Then** the tool reports "no skills matched" and succeeds (it is not an error to find nothing). +4. **Given** any search, **When** the user requests machine-readable output, **Then** the output is complete (no truncation) and contains every field a downstream agent needs to call `add`. + +--- + +### User Story 2 - Vendor a skill directly from the remote library (Priority: P1) + +A developer (or their agent) has found the skill they want and vendors it with a single command. They do **not** first clone or copy the library anywhere — the tool fetches the skill's content from the remote library on their behalf, places it in the repo, and records its exact identity so the same content can be proven later. + +**Why this priority**: This is the keystone that turns skillrig from a local-path tool into one an organization can adopt. It completes the discover→acquire→verify loop and unblocks every later capability (upgrades, multi-client placement). + +**Independent Test**: From a repo bound to a remote library, with **no** local copy of that library present, run the add command for a published skill; confirm the skill's files appear in the repo identical to the library's, that an identity record is written, and that the existing `verify` command then passes against what was vendored. + +**Acceptance Scenarios**: + +1. 
**Given** a repo bound to a remote library and **no** local copy of it, **When** the user runs `skillrig add ` for a published skill, **Then** the skill's content is placed in the repo exactly as the library holds it and an identity record (which version, where it came from, and a fingerprint of the content) is written. +2. **Given** a freshly vendored skill, **When** the user runs `skillrig verify`, **Then** verification passes — the recorded identity and the on-disk content agree. +3. **Given** a skill already vendored at the same version and content, **When** the user runs `skillrig add ` again, **Then** the tool reports "already up to date" and changes nothing (a safe, repeatable no-op). +4. **Given** a skill already vendored but locally modified, **When** the user runs `skillrig add ` again, **Then** the tool refuses to silently overwrite and tells the user how to force it — matching the behavior vendoring from a local copy already has. +5. **Given** a request for a skill the library does not publish, **When** the user runs `skillrig add `, **Then** the tool reports the skill was not found in the library and suggests how to discover the correct name. + +--- + +### User Story 3 - Acquire a pinned, reproducible version (Priority: P2) + +A developer wants the acquisition to be reproducible: not "whatever the library's current tip happens to be," but an exact, immutable version. They pin the skill to a specific released version when vendoring, and that exact identity is recorded so a later acquisition (on another machine, in CI, months later) reproduces the same content byte-for-byte. + +**Why this priority**: Reproducibility is core to the product promise ("exactly the version that was reviewed and approved"), but the default path (US2) already records a verifiable fingerprint, so explicit pinning is a strengthening rather than a prerequisite. It can ship immediately after US2 or split out if it risks the MVP. 
+ +**Independent Test**: Vendor a skill pinned to a specific released version; record the result; on a clean repo, vendor the same skill pinned to the same version; confirm the two results are byte-identical and carry the same recorded identity. + +**Acceptance Scenarios**: + +1. **Given** a library that has published more than one released version of a skill, **When** the user vendors it pinned to a specific version, **Then** that exact version's content is placed and its immutable identity is recorded. +2. **Given** a skill vendored at a pinned version, **When** another user vendors the same skill at the same pin on a clean repo, **Then** both repos hold byte-identical content with the same recorded identity. + +--- + +### User Story 4 - Trustworthy, navigable failures (Priority: P2) + +When acquisition or discovery cannot proceed, the developer (or their agent) gets an error that says what failed, the real reason, and what to do next — never a misleading or generic message. In particular, three confusable situations are kept distinct: (a) the tool is too old (or too new) for the library's format; (b) the user lacks permission to reach a private library; (c) the library cannot be reached at all. + +**Why this priority**: Errors-as-navigation is a binding principle of this CLI, and the auth-vs-not-found confusion is the single most common onboarding and CI footgun. Getting these distinct is what makes the remote path safe to hand to an agent. It is P2 only because the happy path (US1/US2) must exist first to fail against. + +**Independent Test**: Drive each failure independently — point the tool at a library whose format it does not support; attempt to reach a private library without credentials; attempt to reach an unreachable library — and confirm each produces a *distinct*, actionable message, and that a verbose mode reveals the underlying cause. + +**Acceptance Scenarios**: + +1. 
**Given** a library whose published format is newer than (or otherwise incompatible with) what this tool understands, **When** the user runs `search` or `add`, **Then** the tool fails clearly, stating a compatibility mismatch and what to do (e.g. update the tool), rather than misbehaving or producing partial results. +2. **Given** a private library the user is not authenticated to, **When** the user runs `search` or `add`, **Then** the tool reports an **authentication** problem distinctly from "skill not found" or "library not found," and points at how to authenticate. +3. **Given** a library that cannot be reached (offline, wrong location), **When** the user runs `search` or `add`, **Then** the tool reports the library could not be reached and suggests the likely fixes, distinct from an authentication or compatibility failure. +4. **Given** any of the above, **When** the user re-runs with verbose output, **Then** the underlying raw cause is shown without being swallowed. + +--- + +### User Story 5 - Keep the library's catalog honest (origin maintainer) (Priority: P2) + +A library maintainer (or the library's CI) regenerates the published catalog directly from the skills in the library, so that what `search` shows is always an accurate, up-to-date reflection of what the library actually contains — never a hand-maintained list that silently drifts out of sync. + +**Why this priority**: `search` is only as trustworthy as the catalog it reads. A catalog maintained by hand (or by a fragile script that omits fields like topics) will drift from the real skills, making discovery lie. Because skillrig is the single tool for *both* consuming *and* maintaining a library, the same binary that reads the catalog must be able to produce it from one source of truth — closing the loop and guaranteeing the producer and consumer agree by construction. 
P2 because consumers can be demonstrated against a correct fixture catalog first, but it must ship in this slice or `search` rests on a known-drifting producer. + +**Independent Test**: In a library, run the catalog-generation command; confirm the produced catalog exactly matches what the skills on disk declare (every skill, every advertised field including topics), and that re-running it on unchanged skills produces an identical catalog (deterministic). Confirm a stale, hand-edited catalog is corrected to match the skills. + +**Acceptance Scenarios**: + +1. **Given** a library whose skills declare names, versions, descriptions, and topics, **When** the maintainer regenerates the catalog, **Then** the catalog lists exactly those skills with exactly those fields — including the topics `search --topic` filters on. +2. **Given** an unchanged set of skills, **When** the catalog is regenerated twice, **Then** the two catalogs are identical (deterministic, no spurious churn). +3. **Given** a catalog that has drifted from the skills on disk, **When** the maintainer regenerates it, **Then** the catalog is brought back into exact agreement with the skills. + +--- + +### Edge Cases + +- **Library with an empty catalog**: `search` reports "no skills published" and succeeds; `add` of any name reports "not found." +- **Origin is a local path vs a remote `OWNER/REPO`**: these are two distinct configured forms, not a precedence to resolve — the tool uses whichever the origin was configured as and reports which form it used. It never silently maintains a local copy of a remote (A1). +- **Skill listed in the catalog but its content is missing/incomplete in the library**: treated as a library-side problem and reported as such (distinct from "not found" and from "auth"), not as a silent partial vendor. 
+- **Topic filter matches the catalog but the chosen skill is later not fetchable** (US1 found it, US2 cannot get it): the discovery success and the acquisition failure are reported independently and honestly. +- **Catalog and the actual published skills disagree** (a skill is listed but the library has moved on, or vice-versa): the tool does not invent results; it reports what it can verify and surfaces the discrepancy. +- **Pin names a version that does not exist**: reported as an actionable "no such version," distinct from "skill not found." + +## Requirements *(mandatory)* + +### Functional Requirements + +**Discovery (`search`)** + +- **FR-001**: Users MUST be able to list the skills their bound library publishes, via a `search` command, without first obtaining a local copy of the library. +- **FR-002**: Users MUST be able to supply a free-text **query** (positional) that is matched against each skill's name and description (and topics); a skill matches when **every** query term appears. Matching MUST be deterministic and repeatable with **no** semantic, fuzzy, or learned-relevance inference (N6). When results are presented in an order, that order MUST be a deterministic function of the query and the catalog (a fixed relevance grouping, then alphabetical by name) — never a non-reproducible ranking. +- **FR-002a**: Users MUST also be able to narrow by one or more **topics** — a structured, exact-string filter distinct from the free-text query — where a skill matches only if it carries all requested topics. ("Topic" is the deliberate name for these labels; they are **not** git tags, which skillrig reserves for version pins.) +- **FR-003**: `search` MUST present results in two levels: a compact, scannable human listing with a footer hint toward the next step, and a complete machine-readable form that includes every field needed to subsequently vendor a listed skill. 
+- **FR-004**: Finding no matches MUST be a successful outcome that clearly says nothing matched — never an error. +- **FR-005**: Each listed skill MUST include at least its exact name (as accepted by `add`), its version, and a one-line description. + +**Remote acquisition (`add`)** + +- **FR-006**: Users MUST be able to vendor a published skill directly from the **remote** library with a single `add` command, with **no** pre-existing local copy of the library required. +- **FR-007**: Vendoring MUST place the skill's content in the repo identical to what the library holds for the acquired version. +- **FR-008**: Vendoring MUST record the skill's exact identity — its provenance (the exact source point it came from), a content fingerprint, **and** the human-readable version/tag it resolved to — such that the existing `verify` command can later confirm the on-disk content matches what was recorded, and a human can reason about which version they have (older/newer) without decoding the provenance. +- **FR-009**: Re-vendoring a skill that is already present at the same version and content MUST be a safe no-op that reports "already up to date" and changes nothing. +- **FR-010**: Re-vendoring a skill whose local content has diverged MUST refuse to silently overwrite, and MUST tell the user how to force the overwrite — consistent with the behavior when vendoring from a local copy. +- **FR-011**: Vendoring from a **local-path library** MUST continue to work unchanged (remote acquisition is additive, not a replacement). A "local library" exists **only** when the user has explicitly configured a local filesystem path as the origin; the tool MUST NOT create or maintain a local copy of a *remote* library on the user's behalf, and there is therefore no "both present" precedence to resolve — the origin is either an explicit local path or a remote `OWNER/REPO`, and the tool reports which form it used. 
+- **FR-012**: Requesting a skill the library does not publish MUST report "not found in the library" with guidance to discover the correct name (e.g. run `search`), and MUST be distinct from reaching/auth failures. + +**Reproducible pinning (`add`)** + +- **FR-013**: Users MUST be able to vendor a skill pinned to a specific released version, and that exact identity MUST be recorded — including both the immutable provenance the pin resolved to and the human-readable version/tag of the pin itself (per FR-008). +- **FR-014**: Vendoring the same skill at the same pin on a clean repo MUST reproduce byte-identical content with the same recorded identity. +- **FR-015**: Pinning to a version that does not exist MUST be reported as an actionable "no such version," distinct from "skill not found." + +**Trust & failure modes (both commands)** + +- **FR-016**: When the library's published format is incompatible with this tool, both `search` and `add` MUST fail clearly with a compatibility-mismatch message and a suggested remedy, rather than misbehaving or producing partial results. +- **FR-017**: When the library is private and the user is not authenticated, both commands MUST report an **authentication** failure that is distinct from "not found" and from "unreachable," and MUST point at how to authenticate. +- **FR-018**: When the library cannot be reached, both commands MUST report an unreachable-library failure distinct from authentication and compatibility failures. +- **FR-019**: Every error MUST state what failed, the real (never-swallowed) cause, and a suggested fix, with a verbose mode that reveals the underlying raw cause (errors-as-navigation). +- **FR-020**: Both commands MUST expose the project's standard output and diagnostic options (machine-readable output; verbose); `add` MUST additionally support a dry-run preview and a force override, consistent with the existing vendoring command. 
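
The failure classes in FR-016 through FR-019 can be sketched as a small typed error. This is only an illustration under stated assumptions: `NavError`, `classify`, `checkConvention`, and the matched stderr substrings are hypothetical names and patterns chosen for the sketch, not the tool's actual implementation.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// FailureClass names the failure classes the spec keeps distinct.
type FailureClass string

const (
	ClassIncompatible FailureClass = "incompatible"
	ClassAuth         FailureClass = "auth"
	ClassUnreachable  FailureClass = "unreachable"
	ClassOther        FailureClass = "other"
)

// NavError is a hypothetical errors-as-navigation value (FR-019): what failed,
// a suggested fix, and the raw cause, which is kept and never swallowed.
type NavError struct {
	Class FailureClass
	What  string
	Fix   string
	Cause error
}

// Error renders the default (non-verbose) message.
func (e *NavError) Error() string { return e.What + " (fix: " + e.Fix + ")" }

// Unwrap exposes the raw cause to errors.Is and errors.As.
func (e *NavError) Unwrap() error { return e.Cause }

// Verbose appends the raw cause, as a verbose mode might.
func (e *NavError) Verbose() string {
	return fmt.Sprintf("%s\n  cause: %v", e.Error(), e.Cause)
}

// classify maps transport stderr onto exactly one class. The substrings are
// illustrative assumptions, not the tool's real matching rules.
func classify(stderr string, cause error) *NavError {
	s := strings.ToLower(stderr)
	switch {
	case strings.Contains(s, "authentication failed"), strings.Contains(s, "permission denied"):
		return &NavError{Class: ClassAuth, What: "cannot authenticate to the library",
			Fix: "check your git/gh credentials", Cause: cause}
	case strings.Contains(s, "could not resolve host"), strings.Contains(s, "connection refused"):
		return &NavError{Class: ClassUnreachable, What: "library could not be reached",
			Fix: "check the network and the configured origin", Cause: cause}
	default:
		return &NavError{Class: ClassOther, What: "library fetch failed",
			Fix: "re-run with verbose output", Cause: cause}
	}
}

// checkConvention reports an incompatibility when the catalog's convention
// version differs from the one this sketch supports; nil means compatible.
func checkConvention(got, supported int) *NavError {
	if got == supported {
		return nil
	}
	return &NavError{Class: ClassIncompatible,
		What:  fmt.Sprintf("catalog convention %d is not supported (this tool supports %d)", got, supported),
		Fix:   "update the tool",
		Cause: fmt.Errorf("skillrigConvention=%d", got)}
}

func main() {
	raw := errors.New("exit status 128")
	err := classify("fatal: Authentication failed for 'https://example.invalid/repo'", raw)
	fmt.Println(err.Class)
	fmt.Println(err.Verbose())
	fmt.Println(errors.Is(err, raw)) // the raw cause stays reachable
}
```

Keeping the class as data, rather than parsing it back out of message text, is what lets each class map to its own message while the raw cause stays attached for verbose output.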
+ +**Exit behavior** + +- **FR-021**: `search` MUST exit success on any well-formed query (including an empty result) and signal a usage/configuration problem with the standard usage/config exit status; it does not produce verification or prerequisite failures. +- **FR-022**: `add` MUST exit success on a completed vendor *and* on an idempotent no-op, and signal a usage/configuration problem (including not-found, auth, unreachable, and incompatibility, which are configuration/usage-class for this slice) with the standard usage/config exit status. Verification-failure and prerequisite-failure exit statuses remain out of scope for this slice. + +**Library catalog generation (origin-side)** + +- **FR-025**: The tool MUST be able to generate the library's published catalog from the skills in the library, so the catalog is a faithful, current reflection of those skills (US5). The generated catalog MUST carry every field `search` consumes — including the topics used for filtering. +- **FR-026**: Catalog generation MUST be deterministic: regenerating it from an unchanged set of skills MUST produce an identical catalog (no spurious changes). +- **FR-027**: The catalog MUST reflect the skills as they currently exist in the library (the current state), one entry per skill; it is **not** a version-history index. Earlier versions of a skill are reached by acquiring a specific pin (US3), not by browsing the catalog. +- **FR-028**: The generated catalog and the skills it is generated from MUST be derived from the **same** definition the tool reads when vendoring and verifying — so the producer (catalog) and the consumers (`search`, `add`, `verify`) cannot disagree about what a skill declares. 
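
The determinism requirement in FR-026 can be sketched as follows. This is a minimal illustration, assuming the catalog is serialized as JSON with the `skillrigConvention` and `skills` keys used by the test fixtures in this change; `entry`, `catalog`, and `render` are hypothetical names, and sorting by name is one possible way to make the output independent of discovery order.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"sort"
)

// entry is a hypothetical catalog entry carrying the fields search consumes.
type entry struct {
	Name        string   `json:"name"`
	Version     string   `json:"version"`
	Description string   `json:"description"`
	Topics      []string `json:"topics"`
}

// catalog is the hypothetical top-level document.
type catalog struct {
	SkillrigConvention int     `json:"skillrigConvention"`
	Skills             []entry `json:"skills"`
}

// render produces catalog bytes that depend only on the set of skills, not on
// the order in which they were discovered on disk.
func render(skills []entry) ([]byte, error) {
	// Copy before sorting so the caller's slice is left untouched; the copy is
	// non-nil, so an empty library serializes as [] rather than null.
	out := make([]entry, len(skills))
	copy(out, skills)
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })

	data, err := json.MarshalIndent(catalog{SkillrigConvention: 1, Skills: out}, "", "  ")
	if err != nil {
		return nil, err
	}
	return append(data, '\n'), nil
}

func main() {
	a := entry{Name: "aws-iam-audit", Version: "2.0.0", Topics: []string{"security", "aws"}}
	b := entry{Name: "k8s-manifest-check", Version: "1.0.0", Topics: []string{"kubernetes"}}

	first, err := render([]entry{b, a})
	if err != nil {
		panic(err)
	}
	second, err := render([]entry{a, b})
	if err != nil {
		panic(err)
	}

	fmt.Println(bytes.Equal(first, second)) // same skills, different discovery order
	fmt.Print(string(first))
}
```

Regenerating wholesale from the skills, instead of patching an existing file, also means a hand-edited catalog is overwritten with whatever the skills declare.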
+ +**Co-evolution deliverables (process requirements for this branch)** + +- **FR-023**: The PoC origin template repository (the real library this feature is designed against) MUST be updated so its published catalog is produced by the tool's own generation (FR-025) rather than the hand-maintained helper that currently drifts (it omits topics), and so each skill is defined by a single `SKILL.md` (its separate sidecar manifest file is folded in — see Assumption A7). The reconciliation MUST be recorded. +- **FR-024**: The project roadmap and architecture documents MUST be updated to record the divergences this branch introduces: (a) discovery + remote acquisition + catalog generation ship as **one** combined slice; (b) the earlier local-copy seam is reframed (explicit local path vs remote `OWNER/REPO`, no tool-managed cache); (c) the skill definition moves to a single `SKILL.md`/frontmatter (A7), which entails a standard-frontmatter parsing dependency that must be reconciled against the "no new dependencies" note; (d) factual corrections surfaced by the spikes — the catalog is generated on merge to the default branch (not "on release"), and the documented private-CLI token precedence was inaccurate. + +### Key Entities *(include if feature involves data)* + +- **Library (origin)**: the org's source-of-truth repository of skills, already resolvable by the tool. It takes one of two forms: a **remote** library identified by `OWNER/REPO` with an optional branch/ref (fetched over the network), or a **local-path** library the user explicitly configured as a filesystem path. The tool never silently converts one into the other. +- **Library catalog**: the library's published, machine-readable list of available skills — each entry carrying at least name, version, description, and topics — plus the format/compatibility marker the tool checks. The basis for `search`. Discovery-only: it does not itself carry per-skill fingerprints. 
+- **Skill**: a named, versioned unit of agent instruction content vendored into the consumer repo. +- **Topic**: a deterministic label attached to a skill in the catalog (carried in the skill's frontmatter), used to filter discovery. Data only; no inferential matching. Named "topic" (not "tag") to avoid colliding with git tags, which skillrig uses for immutable version pins. +- **Pin**: an explicit version reference (a tag/release) used at acquisition time to make the result reproducible (distinct from the library's moving branch pointer). The pin resolves to an immutable provenance point, and both the provenance and the human-readable pin are recorded. +- **Identity record (lock entry)**: the per-skill record written at acquisition time — the resolved version/tag, source/provenance, and content fingerprint — that `verify` later checks. Extends 002's shape with the human-readable version/tag alongside the provenance; this slice can write it from a remote source. + +## Assumptions + +These are reasonable defaults adopted so the spec is internally consistent. After the 2026-05-31 clarify session, A1 is **settled** (no longer an assumption — see Clarifications) and A4 is **pending Spike S3**; the rest stand unless a spike revises them. Full detail in [spec-tech.md](./spec-tech.md). + +- **A1 — Local vs remote source (FR-006, FR-011) — SETTLED 2026-05-31**: the origin is *either* a remote `OWNER/REPO` (fetched) *or* an explicitly-configured local filesystem path; the tool never creates or caches a local copy of a remote, so there is no "both present" precedence. (Supersedes the original "prefer the local copy" wording.) +- **A2 — Discovery freshness**: `search` reflects the library's current published catalog at run time; if the library cannot be reached, `search` reports an unreachable failure rather than serving a stale result (no offline cache this slice). Confirmed: fetch the catalog per `search` call. 
+- **A3 — Query & topic filtering**: the free-text query is case-insensitive token-AND substring over name+description+topics; multiple `--topic` values narrow further (a skill must carry all requested topics; exact-string, case-insensitive). Result order is a fixed relevance grouping (exact-name > name-hit > topic-hit > description-only) then alphabetical by name — deterministic, no learned ranking (N6). +- **A4 — Authentication — PENDING Spike S3**: the tool reuses the user's existing standard credentials for reaching the library (the same mechanism already required to clone private org repos) via `os/exec` of `gh`/`git`; it introduces no credential of its own and stores nothing. The exact token-resolution path is the subject of Spike S3. +- **A5 — Reproducibility anchor**: the recorded content fingerprint proves the on-disk content still matches what was vendored from the library at a specific provenance point; it is not an independently library-attested hash (the library does not publish per-skill fingerprints in its catalog). +- **A6 — Identity/lock shape**: the per-skill identity record extends 002's shape with the resolved human-readable version/tag (D-pin); this slice also lets the content originate remotely. +- **A7 — Single-file skill definition (from Spike S1) — SCOPE**: each skill is defined by a single `SKILL.md` whose standard frontmatter carries its metadata (with tool-specific fields under a namespaced extension), aligning with the cross-ecosystem skill standard; the separate sidecar manifest file shipped in 002 is folded into that frontmatter as the first step of this work. Rationale, exact field mapping, and migration sizing are in [spec-tech.md](./spec-tech.md) §8b/S1. (Implies adopting a standard frontmatter parser — `gopkg.in/yaml.v3`, **accepted 2026-05-31** to align with the standard, the same parser `gh` uses; recorded as a deliberate divergence from "no new dependencies" in the FR-024 update.) 
+- **A8 — Catalog is generated, single-tip (from Spike S2) — SCOPE**: the catalog is produced by the tool from the library's current skills (FR-025) and reflects only the current state at the library's selected branch/ref (FR-027), regenerated wholesale; it does not aggregate history and needs no pruning. Detail in [spec-tech.md](./spec-tech.md) §8b/S2. + +## Success Criteria *(mandatory)* + +### Measurable Outcomes + +- **SC-001**: A new user can go from a freshly bound repo to a vendored, verifying skill using only `search` then `add` — with **no** manual checkout or copy of the library — in a single sitting, demonstrated end-to-end against the real PoC library. +- **SC-002**: 100% of `search` results are deterministic: identical inputs against an unchanged library produce identical output across repeated runs. +- **SC-003**: A skill vendored remotely and then checked with `verify` passes 100% of the time when untouched (the recorded identity and on-disk content agree). +- **SC-004**: The same skill vendored at the same pin on two clean repos yields byte-identical content and identical recorded identity 100% of the time. +- **SC-005**: Each of the three confusable failure classes — incompatible format, authentication, unreachable — produces a distinct, actionable message; in usability checks a reader can correctly identify which class occurred from the message alone. +- **SC-006**: Re-running `add` on an unchanged, already-vendored skill changes nothing on disk and reports an idempotent no-op 100% of the time. +- **SC-007**: Vendoring from a local copy of the library (the 002 path) continues to pass its existing acceptance scenarios unchanged (no regression). +- **SC-008**: Both consumer commands' help text alone lets an agent succeed on the first attempt: it states purpose and shows at least two runnable examples. 
+- **SC-009**: A catalog generated by the tool from a library's skills matches what those skills declare 100% of the time (every skill, every advertised field including topics), and regenerating it on unchanged skills is byte-identical — so `search` never shows a skill or field that the library's skills don't actually declare. + +### Previous work + +### Epic: SL-227789 — CLI Initialization & Origin Resolution (001, closed) + +- **Origin resolution & `init`**: established the single origin resolver (`env > project config > global default`), the `OWNER/REPO[@REF]` origin grammar, and the baseline CLI experience (self-documenting help, errors-as-navigation, two-level output, exit codes) that this slice inherits. + +### Feature: 002 — Vendor & Verify Skills (`add` + `verify`, merged) + +- **`skillcore` + local `add` + `verify`**: shipped the shared trust primitive (content fingerprint + manifest parse), local-copy vendoring with idempotent no-op / force-on-divergence UX, and the offline integrity gate this slice's remote acquisition writes records for and reuses. This slice extends `add` from a local copy to a remote library and adds `search`; both reuse the same shared core (no parallel implementation). + +> External references: this feature is designed against the real PoC origin template repository (`github.com/skillrig/origin-template`, checked out alongside this repo). If its published contract should be tracked as a formal dependency for reading/reference, add it with `sl deps add`. 
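
The matching and ordering rules from FR-002 and Assumption A3 can be sketched as below. This is a hedged illustration, not the tool's code: `skill`, `matches`, `bucket`, `search`, and `fixture` are hypothetical names, and the way a multi-term query is assigned to a relevance bucket (the best bucket any term reaches) is an assumption, since the spec only fixes the bucket order.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// skill is a hypothetical catalog entry reduced to the searchable fields.
type skill struct {
	Name, Description string
	Topics            []string
}

// Relevance buckets in the order A3 gives; a lower value sorts first.
const (
	bucketExactName = iota
	bucketNameHit
	bucketTopicHit
	bucketDescriptionOnly
)

// matches reports whether every term appears, case-insensitively, as a
// substring of name+description+topics (token-AND).
func matches(s skill, terms []string) bool {
	hay := strings.ToLower(s.Name + "\n" + s.Description + "\n" + strings.Join(s.Topics, "\n"))
	for _, t := range terms {
		if !strings.Contains(hay, strings.ToLower(t)) {
			return false
		}
	}
	return true
}

// bucket picks the best bucket any term reaches (an assumption of this sketch).
func bucket(s skill, terms []string) int {
	name := strings.ToLower(s.Name)
	if len(terms) > 0 && name == strings.ToLower(strings.Join(terms, " ")) {
		return bucketExactName
	}
	best := bucketDescriptionOnly
	for _, t := range terms {
		t = strings.ToLower(t)
		if strings.Contains(name, t) {
			return bucketNameHit
		}
		for _, topic := range s.Topics {
			if strings.Contains(strings.ToLower(topic), t) {
				best = bucketTopicHit
			}
		}
	}
	return best
}

// search returns matching names ordered by bucket, then alphabetically.
func search(cat []skill, query string) []string {
	terms := strings.Fields(query)
	type hit struct {
		name string
		b    int
	}
	hits := []hit{}
	for _, s := range cat {
		if matches(s, terms) {
			hits = append(hits, hit{s.Name, bucket(s, terms)})
		}
	}
	sort.Slice(hits, func(i, j int) bool {
		if hits[i].b != hits[j].b {
			return hits[i].b < hits[j].b
		}
		return hits[i].name < hits[j].name
	})
	names := make([]string, len(hits))
	for i, h := range hits {
		names[i] = h.name
	}
	return names
}

// fixture mirrors the four-skill catalog used by the integration tests.
func fixture() []skill {
	return []skill{
		{"terraform-plan-review", "Review a terraform plan for risk before apply.", []string{"platform-team", "terraform", "aws"}},
		{"terraform-module-lint", "Lint a terraform module for style and structure.", []string{"terraform"}},
		{"aws-iam-audit", "Audit a terraform-managed AWS IAM policy set for drift.", []string{"security", "aws"}},
		{"k8s-manifest-check", "Validate kubernetes manifests before rollout.", []string{"kubernetes"}},
	}
}

func main() {
	fmt.Println(search(fixture(), "terraform"))
	fmt.Println(search(fixture(), "terraform plan"))
}
```

Because the order is a pure function of the query and the catalog, two runs over an unchanged catalog return the same list, which is the property SC-002 asks for.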
diff --git a/test/searchindex_quickstart_test.go b/test/searchindex_quickstart_test.go new file mode 100644 index 0000000..ad281b5 --- /dev/null +++ b/test/searchindex_quickstart_test.go @@ -0,0 +1,1546 @@ +// This file holds the TestQuickstart_* integration suite for feature +// 003-search-remote, the scenarios that are exercisable end-to-end against the +// real binary today: the discover (search, US1) and catalog-generation (index, +// US5) groups, plus the add --help shape (US2 SC-008). Each maps 1:1 to a +// scenario in specledger/003-search-remote/quickstart.md. +// +// Like the 001/002 suites it builds the binary once (TestMain in +// quickstart_test.go) and execs it via run(). It reuses the 002 fixture +// helpers (git, copyTree, sampleOriginDir, pinnedGitEnv, originRepo, +// decodeJSON, requireKeys, countExampleLines) and the RAW-git oracle +// discipline: every fixture is bootstrapped and every expected value computed +// with raw git, NEVER through skillcore (Constitution III / research D11). +// +// SUBSTRATE NOTE (S4 / D6). The remote-acquisition group (US2 remote add, US3 +// --pin, US4 auth/unreachable failures) runs against a real file:// bare repo +// for the CLI's origin (FIX-1 gave config.ParseOrigin a local/file:// form and +// pkg/skillcore.cloneURL a file:// seam, so `add` with no local checkout clones +// a t.TempDir bare repo over a real git transport — offline, no github.com). +// newRemoteOrigin git-inits a working tree (committed + a v-tag), clones it +// --bare, and binds SKILLRIG_ORIGIN=file://; the RAW-git oracle reads the +// expected treeSha straight from that bare repo (never skillcore, D11). 
+// +// Injected git failures (US4 auth/unreachable/private-not-found) are produced +// at the integration tier by a fake `git` on the binary's PATH (fakeGitBin) that +// passes every command through to the real git EXCEPT `clone`, which it fails +// with a crafted (exit 128, stderr) — the integration analog of the +// pkg/skillcore commandContext stub seam (which, being an unexported field, is +// only reachable from a pkg/skillcore unit test). The clone-phase failure trips +// the catalog gate before any subtree is fetched, so the CLI renders the +// auth/unreachable/not-found class distinctly. The typed-class assertions for +// those classes live as unit tests in pkg/skillcore (TestClassifyFetchError), +// per the quickstart's own "unit-level via the stub seam" note. +package quickstart + +import ( + "bytes" + "encoding/json" + "fmt" + "os" + "os/exec" + "path/filepath" + "strconv" + "strings" + "testing" +) + +// --------------------------------------------------------------------------- +// Search/index fixture helpers (local-path origin form — the testable path). +// --------------------------------------------------------------------------- + +// catalogSkill is one entry in a hand-authored index.json. The fields mirror +// skillcore.CatalogEntry (the search --json projection), so an entry written +// here round-trips through the binary's search reader. +type catalogSkill struct { + Name string `json:"name"` + Version string `json:"version"` + Namespace string `json:"namespace"` + Description string `json:"description"` + Topics []string `json:"topics"` + Path string `json:"path"` +} + +// catalogFile is a hand-authored index.json: the convention version search +// gates on, the origin identity, and the skills it lists. 
+type catalogFile struct { + SkillrigConvention int `json:"skillrigConvention"` + Origin string `json:"origin"` + Skills []catalogSkill `json:"skills"` +} + +// searchCatalog is a small, deterministic multi-skill catalog used by the +// search scenarios that need more than the single-skill fixture: ordering, +// token-AND query, and topic filtering. The names are chosen so the relevance +// buckets are unambiguous (two name-substring hits for the "terraform" query, +// plus a description-only hit and an unrelated skill). +func searchCatalog() catalogFile { + return catalogFile{ + SkillrigConvention: 1, + Origin: originRepo, + Skills: []catalogSkill{ + { + Name: "terraform-plan-review", + Version: "1.4.0", + Namespace: "my-org", + Description: "Review a terraform plan for risk before apply.", + Topics: []string{"platform-team", "terraform", "aws"}, + Path: "skills/terraform-plan-review", + }, + { + Name: "terraform-module-lint", + Version: "0.9.0", + Namespace: "my-org", + Description: "Lint a terraform module for style and structure.", + Topics: []string{"terraform"}, + Path: "skills/terraform-module-lint", + }, + { + Name: "aws-iam-audit", + Version: "2.0.0", + Namespace: "my-org", + Description: "Audit a terraform-managed AWS IAM policy set for drift.", + Topics: []string{"security", "aws"}, + Path: "skills/aws-iam-audit", + }, + { + Name: "k8s-manifest-check", + Version: "1.0.0", + Namespace: "my-org", + Description: "Validate kubernetes manifests before rollout.", + Topics: []string{"kubernetes"}, + Path: "skills/k8s-manifest-check", + }, + }, + } +} + +// searchConsumer is a consumer repo whose origin (a local checkout at +// /my-org/my-skills) ships only an index.json — search reads the catalog +// straight off disk (it does not need the origin to be a committed git repo, +// only the consumer to be one), so a catalog fixture is all the substrate the +// search scenarios require. 
+type searchConsumer struct { + root string +} + +// newSearchConsumer builds a git consumer repo, writes the given catalog as the +// origin's index.json at /my-org/my-skills/index.json, and binds the +// origin via SKILLRIG_ORIGIN at call sites (search resolves it like every +// command). The origin checkout is kept out of the consumer index so it never +// pollutes the working tree the way 002 arranges it. +func newSearchConsumer(t *testing.T, cat catalogFile) searchConsumer { + t.Helper() + requireGit(t) + + root := t.TempDir() + git(t, root, "init", "-q", "-b", "main") + + originDir := filepath.Join(root, filepath.FromSlash(originRepo)) + if err := os.MkdirAll(originDir, 0o755); err != nil { + t.Fatalf("mkdir origin %s: %v", originDir, err) + } + + data, err := json.MarshalIndent(cat, "", " ") + if err != nil { + t.Fatalf("marshal catalog: %v", err) + } + + if err := os.WriteFile(filepath.Join(originDir, "index.json"), append(data, '\n'), 0o644); err != nil { + t.Fatalf("write index.json: %v", err) + } + + return searchConsumer{root: root} +} + +// search runs `skillrig search args...` in the consumer with the origin bound +// via SKILLRIG_ORIGIN (the env precedence the resolver honors). +func (c searchConsumer) search(t *testing.T, args ...string) runResult { + t.Helper() + + return run(t, runOpts{ + args: append([]string{"search"}, args...), + cwd: c.root, + env: map[string]string{"SKILLRIG_ORIGIN": originRepo}, + }) +} + +// bootstrapOriginRepo bootstraps a committed local origin git repo from a +// source tree (the committed fixture, or a multi-skill tree built in a +// t.TempDir) so `index` — which runs INSIDE the origin repo and finds its root +// via git — has a real work tree. It returns the origin repo root. 
+func bootstrapOriginRepo(t *testing.T, src string) string { + t.Helper() + requireGit(t) + + dir := t.TempDir() + copyTree(t, src, dir) + + git(t, dir, "init", "-q", "-b", "main") + git(t, dir, "add", "-A") + git(t, dir, "commit", "-q", "-m", "origin fixture") + + return dir +} + +// indexIn runs `skillrig index args...` with cwd inside the origin repo (index +// is an origin-side generator: it locates the origin root from the cwd). +func indexIn(t *testing.T, originRoot string, args ...string) runResult { + t.Helper() + + return run(t, runOpts{args: append([]string{"index"}, args...), cwd: originRoot}) +} + +// writeSkillMD writes a SKILL.md with the given frontmatter under +// originRoot/skills//. frontmatter is the full frontmatter body (between the +// fences) so a scenario can author a malformed or version-less manifest; body +// is the markdown content that follows the closing fence. +func writeSkillMD(t *testing.T, originRoot, name, frontmatter, body string) { + t.Helper() + + dir := filepath.Join(originRoot, "skills", name) + if err := os.MkdirAll(dir, 0o755); err != nil { + t.Fatalf("mkdir skill %s: %v", name, err) + } + + content := "---\n" + frontmatter + "\n---\n\n" + body + "\n" + if err := os.WriteFile(filepath.Join(dir, "SKILL.md"), []byte(content), 0o644); err != nil { + t.Fatalf("write SKILL.md for %s: %v", name, err) + } +} + +// writeOriginConfig writes a minimal .skillrig-origin.toml (convention 1) at +// originRoot so index can read the convention version and skills dir. +func writeOriginConfig(t *testing.T, originRoot string) { + t.Helper() + + cfg := "convention_version = 1\norigin = \"" + originRepo + "\"\nskills_dir = \"skills\"\n" + if err := os.WriteFile(filepath.Join(originRoot, ".skillrig-origin.toml"), []byte(cfg), 0o644); err != nil { + t.Fatalf("write origin config: %v", err) + } +} + +// searchEntry mirrors one search --json skill entry for completeness assertions. 
+type searchEntry struct { + Name string `json:"name"` + Version string `json:"version"` + Namespace string `json:"namespace"` + Description string `json:"description"` + Topics []string `json:"topics"` + Path string `json:"path"` +} + +// searchPayload mirrors the search --json top-level object. +type searchPayload struct { + Origin string `json:"origin"` + Skills []searchEntry `json:"skills"` +} + +// decodeSearch strictly decodes a search --json payload. +func decodeSearch(t *testing.T, stdout string) searchPayload { + t.Helper() + + var p searchPayload + if err := json.Unmarshal([]byte(stdout), &p); err != nil { + t.Fatalf("search --json is not parseable: %v\n%s", err, stdout) + } + + return p +} + +// searchNames returns the matched skill names in result order (search --json +// preserves the binary's relevance+name ordering). +func searchNames(p searchPayload) []string { + out := make([]string, len(p.Skills)) + for i, s := range p.Skills { + out[i] = s.Name + } + + return out +} + +// --------------------------------------------------------------------------- +// US1 — Discover (search) +// --------------------------------------------------------------------------- + +// TestQuickstart_SearchListsSkills — search with no query lists every skill the +// origin publishes (name/version/desc) + a footer hint; bounded human shape. 
+func TestQuickstart_SearchListsSkills(t *testing.T) { + t.Parallel() + + c := newSearchConsumer(t, searchCatalog()) + + res := c.search(t) + if res.exit != 0 { + t.Fatalf("search exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + matches := len(searchCatalog().Skills) + + lines := nonEmptyLines(res.stdout) + if len(lines) > matches+5 { + t.Errorf("human stdout has %d lines, want <= matches+5 (%d):\n%s", len(lines), matches+5, res.stdout) + } + + for _, name := range []string{"terraform-plan-review", "aws-iam-audit", "k8s-manifest-check"} { + if !strings.Contains(res.stdout, name) { + t.Errorf("listing omits %q:\n%s", name, res.stdout) + } + } + + if !strings.Contains(res.stdout, "skillrig add") { + t.Errorf("listing missing the add footer hint:\n%s", res.stdout) + } +} + +// TestQuickstart_SearchQueryMatchesNameDesc — `search terraform plan` keeps only +// skills whose name+description+topics contain BOTH terms (token-AND); a skill +// matching one term but not the other is excluded (FR-002). +func TestQuickstart_SearchQueryMatchesNameDesc(t *testing.T) { + t.Parallel() + + c := newSearchConsumer(t, searchCatalog()) + + res := c.search(t, "--json", "terraform", "plan") + if res.exit != 0 { + t.Fatalf("search exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + names := searchNames(decodeSearch(t, res.stdout)) + + // Only terraform-plan-review carries both "terraform" AND "plan". The other + // terraform skills lack "plan"; aws-iam-audit has "terraform" (description) + // but not "plan"; k8s has neither. + if len(names) != 1 || names[0] != "terraform-plan-review" { + t.Errorf("token-AND query 'terraform plan' = %v, want exactly [terraform-plan-review]", names) + } +} + +// TestQuickstart_SearchOrderingDeterministic — a query hitting several skills is +// ordered by the fixed relevance bucket then name, and is byte-identical across +// two runs (D8/N6, SC-002). 
+func TestQuickstart_SearchOrderingDeterministic(t *testing.T) { + t.Parallel() + + c := newSearchConsumer(t, searchCatalog()) + + first := c.search(t, "--json", "terraform") + second := c.search(t, "--json", "terraform") + + if first.exit != 0 || second.exit != 0 { + t.Fatalf("search exits = %d/%d, want 0/0 (stderr: %s)", first.exit, second.exit, first.stderr) + } + + if first.stdout != second.stdout { + t.Errorf("search ordering not byte-identical across runs:\nA=%s\nB=%s", first.stdout, second.stdout) + } + + names := searchNames(decodeSearch(t, first.stdout)) + + // "terraform" hits all three terraform-ish skills. Bucket order: the two + // name-substring hits (terraform-module-lint, terraform-plan-review) outrank + // the description-only hit (aws-iam-audit); within the name bucket, ties + // break lexicographically by name (module-lint < plan-review). k8s does not + // match at all. + want := []string{"terraform-module-lint", "terraform-plan-review", "aws-iam-audit"} + if strings.Join(names, ",") != strings.Join(want, ",") { + t.Errorf("ordering = %v, want %v (relevance bucket then name)", names, want) + } +} + +// TestQuickstart_SearchFilterByTopic — `search --topic aws` lists only aws-topic +// skills and is identical across two runs. +func TestQuickstart_SearchFilterByTopic(t *testing.T) { + t.Parallel() + + c := newSearchConsumer(t, searchCatalog()) + + first := c.search(t, "--json", "--topic", "aws") + second := c.search(t, "--json", "--topic", "aws") + + if first.exit != 0 { + t.Fatalf("search exit = %d, want 0 (stderr: %s)", first.exit, first.stderr) + } + + if first.stdout != second.stdout { + t.Errorf("topic filter not deterministic:\nA=%s\nB=%s", first.stdout, second.stdout) + } + + names := searchNames(decodeSearch(t, first.stdout)) + + // Exactly the two aws-topic skills (ordered by name: aws-iam-audit then + // terraform-plan-review); the two non-aws skills are excluded. 
+ want := []string{"aws-iam-audit", "terraform-plan-review"} + if strings.Join(names, ",") != strings.Join(want, ",") { + t.Errorf("--topic aws = %v, want %v", names, want) + } +} + +// TestQuickstart_SearchEmptyResult — `search --topic nonesuch` reports no match +// and is still success (exit 0, FR-004). +func TestQuickstart_SearchEmptyResult(t *testing.T) { + t.Parallel() + + c := newSearchConsumer(t, searchCatalog()) + + res := c.search(t, "--topic", "nonesuch") + if res.exit != 0 { + t.Fatalf("empty-result search exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + if !strings.Contains(res.stdout, "no skills matched") { + t.Errorf("human output = %q, want it to say 'no skills matched'", res.stdout) + } + + // --json variant: an empty result is the [] skills array, not null. + jsonRes := c.search(t, "--json", "--topic", "nonesuch") + if jsonRes.exit != 0 { + t.Fatalf("empty-result search --json exit = %d, want 0", jsonRes.exit) + } + + p := decodeSearch(t, jsonRes.stdout) + if len(p.Skills) != 0 { + t.Errorf("empty result --json skills = %v, want []", p.Skills) + } + + if !strings.Contains(jsonRes.stdout, "\"skills\":[]") { + t.Errorf("empty result should serialize skills as [], got:\n%s", jsonRes.stdout) + } +} + +// TestQuickstart_SearchJSONComplete — --json parses and every entry carries the +// full field set add needs (field-presence, not truncation). +func TestQuickstart_SearchJSONComplete(t *testing.T) { + t.Parallel() + + c := newSearchConsumer(t, searchCatalog()) + + res := c.search(t, "--json") + if res.exit != 0 { + t.Fatalf("search --json exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + // Top-level structural completeness. 
+ obj := decodeJSON(t, res.stdout) + requireKeys(t, obj, "origin", "skills") + + rawSkills, ok := obj["skills"].([]any) + if !ok { + t.Fatalf("skills is not an array: %v", obj["skills"]) + } + + if len(rawSkills) == 0 { + t.Fatal("expected at least one skill to assert per-entry completeness") + } + + // Every entry carries name/version/namespace/description/topics/path. + for i, raw := range rawSkills { + entry, ok := raw.(map[string]any) + if !ok { + t.Fatalf("skills[%d] is not an object: %v", i, raw) + } + + requireKeys(t, entry, "name", "version", "namespace", "description", "topics", "path") + } +} + +// TestQuickstart_SearchConventionMismatch — a catalog declaring +// skillrigConvention 2 fails with exit 1 and a 3-part compatibility message +// ("update skillrig"). +func TestQuickstart_SearchConventionMismatch(t *testing.T) { + t.Parallel() + + cat := searchCatalog() + cat.SkillrigConvention = 2 + c := newSearchConsumer(t, cat) + + res := c.search(t) + if res.exit != 1 { + t.Fatalf("convention-2 search exit = %d, want 1 (stderr: %s)", res.exit, res.stderr) + } + + if res.stdout != "" { + t.Errorf("error path must keep stdout empty, got: %q", res.stdout) + } + + // Three distinct parts: what (a version mismatch), why, fix (update skillrig). + assertContains(t, "what", res.stderr, "convention") + assertContains(t, "why", res.stderr, "why:") + assertContains(t, "fix", res.stderr, "update skillrig") +} + +// TestQuickstart_SearchConventionBoundary (C1) — the exact-match gate: a catalog +// declaring convention 0 AND one omitting the field each fail (a lower/missing +// convention does NOT silently pass), while convention 1 passes. Pins the +// non-">" boundary so FR-016/SC-005 is unambiguous. +func TestQuickstart_SearchConventionBoundary(t *testing.T) { + t.Parallel() + + // convention == 1 passes. 
+ pass := newSearchConsumer(t, searchCatalog()) + if res := pass.search(t); res.exit != 0 { + t.Fatalf("convention-1 search exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + // convention == 0 (explicit) fails exit 1. + zeroCat := searchCatalog() + zeroCat.SkillrigConvention = 0 + zero := newSearchConsumer(t, zeroCat) + + if res := zero.search(t); res.exit != 1 { + t.Errorf("convention-0 search exit = %d, want 1 (a lower/zero convention must not pass)", res.exit) + } + + // convention absent (field omitted → JSON default 0) also fails exit 1. Write + // a catalog object WITHOUT the skillrigConvention key. + c := newSearchConsumer(t, searchCatalog()) + absent := map[string]any{"origin": originRepo, "skills": []any{}} + + data, err := json.MarshalIndent(absent, "", " ") + if err != nil { + t.Fatalf("marshal absent-convention catalog: %v", err) + } + + path := filepath.Join(c.root, filepath.FromSlash(originRepo), "index.json") + if err := os.WriteFile(path, append(data, '\n'), 0o644); err != nil { + t.Fatalf("rewrite catalog without convention: %v", err) + } + + if res := c.search(t); res.exit != 1 { + t.Errorf("absent-convention search exit = %d, want 1 (an absent convention must not pass)", res.exit) + } +} + +// TestQuickstart_SearchHelpExamples — search --help shows the purpose line + >=2 +// runnable examples (bounded shape). 
+func TestQuickstart_SearchHelpExamples(t *testing.T) { + t.Parallel() + + res := run(t, runOpts{args: []string{"search", "--help"}}) + if res.exit != 0 { + t.Fatalf("search --help exit = %d, want 0", res.exit) + } + + if n := countExampleLines(res.stdout, "skillrig search"); n < 2 { + t.Errorf("search --help shows %d 'skillrig search' example lines, want >= 2:\n%s", n, res.stdout) + } +} + +// --------------------------------------------------------------------------- +// US2 — add --help (the testable slice of the remote-acquisition group) +// --------------------------------------------------------------------------- + +// TestQuickstart_AddHelpShowsPinExample (C5/SC-008) — add --help shows >=2 +// runnable examples, one of which is the --pin form. (The base add --help shape +// is asserted by the 002 suite's TestQuickstart_AddHelpExamples; this pins the +// 003-specific --pin example requirement.) +func TestQuickstart_AddHelpShowsPinExample(t *testing.T) { + t.Parallel() + + res := run(t, runOpts{args: []string{"add", "--help"}}) + if res.exit != 0 { + t.Fatalf("add --help exit = %d, want 0", res.exit) + } + + if n := countExampleLines(res.stdout, "skillrig add"); n < 2 { + t.Errorf("add --help shows %d 'skillrig add' example lines, want >= 2:\n%s", n, res.stdout) + } + + if !strings.Contains(res.stdout, "--pin") { + t.Errorf("add --help must document a --pin example (SC-008), got:\n%s", res.stdout) + } +} + +// --------------------------------------------------------------------------- +// US5 — Catalog generation (index) +// --------------------------------------------------------------------------- + +// committedIndex reads the committed fixture index.json (the producer==artifact +// oracle for IndexMatchesCommitted). 
+func committedIndex(t *testing.T) string { + t.Helper() + + return readFile(t, filepath.Join("testdata", "sample-origin", "index.json")) +} + +// TestQuickstart_IndexGenerates — `index` over the origin fixture writes an +// index.json whose entries match the skills' frontmatter, INCLUDING topics. +func TestQuickstart_IndexGenerates(t *testing.T) { + t.Parallel() + + originRoot := bootstrapOriginRepo(t, sampleOriginDir(t)) + + res := indexIn(t, originRoot) + if res.exit != 0 { + t.Fatalf("index exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + if !strings.Contains(res.stdout, "indexed 1 skill") { + t.Errorf("index human output = %q, want it to report 1 indexed skill", res.stdout) + } + + var cat catalogFile + if err := json.Unmarshal([]byte(readFile(t, filepath.Join(originRoot, "index.json"))), &cat); err != nil { + t.Fatalf("generated index.json not parseable: %v", err) + } + + if cat.SkillrigConvention != 1 { + t.Errorf("generated convention = %d, want 1 (read from .skillrig-origin.toml)", cat.SkillrigConvention) + } + + if len(cat.Skills) != 1 { + t.Fatalf("generated skills = %d, want 1", len(cat.Skills)) + } + + entry := cat.Skills[0] + if entry.Name != sampleSkill || entry.Version != sampleVersion { + t.Errorf("entry name/version = %q/%q, want %q/%q", entry.Name, entry.Version, sampleSkill, sampleVersion) + } + + // Topics are the field the old build-index.sh dropped — assert they survive. + wantTopics := []string{"platform-team", "terraform", "aws"} + if strings.Join(entry.Topics, ",") != strings.Join(wantTopics, ",") { + t.Errorf("entry topics = %v, want %v (topics must be carried)", entry.Topics, wantTopics) + } + + // --json summary is structurally complete. + jsonRes := indexIn(t, originRoot, "--json") + obj := decodeJSON(t, jsonRes.stdout) + requireKeys(t, obj, "out", "skills", "convention") +} + +// TestQuickstart_IndexDeterministic — running index twice over an unchanged +// skill set yields byte-identical output (SC-009). 
+func TestQuickstart_IndexDeterministic(t *testing.T) { + t.Parallel() + + originRoot := bootstrapOriginRepo(t, sampleOriginDir(t)) + + if res := indexIn(t, originRoot); res.exit != 0 { + t.Fatalf("first index exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + firstBytes := readFile(t, filepath.Join(originRoot, "index.json")) + + if res := indexIn(t, originRoot); res.exit != 0 { + t.Fatalf("second index exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + secondBytes := readFile(t, filepath.Join(originRoot, "index.json")) + if firstBytes != secondBytes { + t.Errorf("index is not deterministic:\nfirst=%s\nsecond=%s", firstBytes, secondBytes) + } +} + +// TestQuickstart_IndexMatchesCommitted — `index` output equals the committed +// PoC index.json (producer == artifact oracle). The committed fixture +// index.json is the ground truth; regenerating it MUST reproduce it byte for +// byte. +// +// De-circularization (FIX-5 / M1): byte-equality alone cannot catch the +// PascalCase-requires-keys bug, because the producer and the committed fixture +// were generated the same (formerly buggy) way — they would agree on "Tool" +// just as readily as on "tool". So this also decodes the committed fixture's +// requires through a struct with `json:"tool"` tags and asserts the field is +// populated: that fails iff the JSON emits "Tool"/PascalCase (data-model §2), +// pinning the lowercase-key contract independently of the producer==artifact +// comparison. 
+func TestQuickstart_IndexMatchesCommitted(t *testing.T) { + t.Parallel() + + originRoot := bootstrapOriginRepo(t, sampleOriginDir(t)) + + if res := indexIn(t, originRoot); res.exit != 0 { + t.Fatalf("index exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + got := readFile(t, filepath.Join(originRoot, "index.json")) + want := committedIndex(t) + + if got != want { + t.Errorf("regenerated index.json != committed fixture (producer/artifact drift):\ngot=%s\nwant=%s", got, want) + } + + assertRequiresKeysLowercase(t, []byte(want)) + assertVersionConstraintsUnescaped(t, []byte(want)) +} + +// assertVersionConstraintsUnescaped asserts the catalog's raw bytes carry the +// readable ">=" version constraint, NOT Go's default HTML-escaped "\u003e=" +// (FIX-5/M1). The catalog MUST be marshaled with SetEscapeHTML(false); this +// raw-byte check is the other half of de-circularizing the producer==artifact +// oracle so the escaping regression cannot hide behind a matching fixture. +func assertVersionConstraintsUnescaped(t *testing.T, indexJSON []byte) { + t.Helper() + + // "\\u003e" is the 6 ASCII bytes backslash-u-0-0-3-e — Go's HTML-escaped + // form of '>'. An interpreted literal (not a raw `...`) is required so the + // sequence is those bytes, not a literal '>' rune. + if bytes.Contains(indexJSON, []byte("\\u003e")) { + t.Errorf("index.json HTML-escapes version constraints (\\u003e) — marshal the catalog "+ + "with SetEscapeHTML(false) so \">=\" stays readable; got:\n%s", indexJSON) + } + + if !bytes.Contains(indexJSON, []byte(">=")) { + t.Errorf("index.json has no readable \">=\" constraint — the unescaped-constraint "+ + "assertion needs at least one to be meaningful; got:\n%s", indexJSON) + } +} + +// requiresProbe decodes just the requires list of each catalog skill via +// lowercase `json` tags. If the catalog emitted PascalCase keys ("Tool"), Tool +// would unmarshal to "" — the discriminator the lowercase-key assertion checks. 
+type requiresProbe struct { + Skills []struct { + Requires []struct { + Tool string `json:"tool"` + Version string `json:"version"` + } `json:"requires"` + } `json:"skills"` +} + +// assertRequiresKeysLowercase decodes the catalog and asserts every requires +// entry's keys are lowercase (data-model §2): a non-empty .tool proves the JSON +// used "tool", not "Tool". It breaks the circular producer==artifact oracle in +// IndexMatchesCommitted so the PascalCase-requires bug (FIX-5/M1) cannot hide. +func assertRequiresKeysLowercase(t *testing.T, indexJSON []byte) { + t.Helper() + + var probe requiresProbe + if err := json.Unmarshal(indexJSON, &probe); err != nil { + t.Fatalf("index.json is not parseable: %v\n%s", err, indexJSON) + } + + if len(probe.Skills) == 0 { + t.Fatal("index.json has no skills to assert requires-key casing on") + } + + sawRequire := false + + for i, s := range probe.Skills { + for j, r := range s.Requires { + sawRequire = true + + if r.Tool == "" { + t.Errorf("skills[%d].requires[%d].tool is empty — requires keys must be lowercase "+ + "\"tool\"/\"version\" (data-model §2), not PascalCase; got:\n%s", i, j, indexJSON) + } + } + } + + if !sawRequire { + t.Fatal("no requires entries found — the lowercase-key assertion needs at least one to be meaningful") + } +} + +// TestQuickstart_IndexMalformedFrontmatter — a skill with broken frontmatter +// fails index with exit 1 naming the offending SKILL.md. +func TestQuickstart_IndexMalformedFrontmatter(t *testing.T) { + t.Parallel() + requireGit(t) + + originRoot := t.TempDir() + writeOriginConfig(t, originRoot) + // Unterminated YAML flow sequence — a parse error, not a schema error. 
+ writeSkillMD(t, originRoot, "broken", "name: [unterminated", "# Broken") + + git(t, originRoot, "init", "-q", "-b", "main") + git(t, originRoot, "add", "-A") + git(t, originRoot, "commit", "-q", "-m", "broken skill") + + res := indexIn(t, originRoot) + if res.exit != 1 { + t.Fatalf("malformed-frontmatter index exit = %d, want 1 (stderr: %s)", res.exit, res.stderr) + } + + if res.stdout != "" { + t.Errorf("error path must keep stdout empty, got: %q", res.stdout) + } + + // The error must name the offending file so the maintainer can fix it. + if !strings.Contains(res.stderr, filepath.Join("skills", "broken", "SKILL.md")) { + t.Errorf("error must name the offending SKILL.md, got:\n%s", res.stderr) + } +} + +// TestQuickstart_IndexNotInOrigin (C8) — running index outside an origin repo +// (a git repo with no .skillrig-origin.toml) fails exit 1 with the what/why/fix +// "run inside the origin repo" navigation. +func TestQuickstart_IndexNotInOrigin(t *testing.T) { + t.Parallel() + requireGit(t) + + // A git repo, but with no .skillrig-origin.toml → not an origin repo. + root := t.TempDir() + git(t, root, "init", "-q", "-b", "main") + + res := indexIn(t, root) + if res.exit != 1 { + t.Fatalf("index outside an origin exit = %d, want 1 (stderr: %s)", res.exit, res.stderr) + } + + if res.stdout != "" { + t.Errorf("error path must keep stdout empty, got: %q", res.stdout) + } + + assertContains(t, "what", res.stderr, "not in an origin repository") + assertContains(t, "why", res.stderr, "why:") + assertContains(t, "fix", res.stderr, "inside the origin repo") +} + +// TestQuickstart_IndexMissingVersion (C9) — a skill whose frontmatter omits the +// required x-skillrig.version fails index with exit 1 naming the offending +// SKILL.md (the catalog-entry validation rule; guards the seed-enrichment +// precondition of IndexMatchesCommitted). 
+func TestQuickstart_IndexMissingVersion(t *testing.T) { + t.Parallel() + requireGit(t) + + originRoot := t.TempDir() + writeOriginConfig(t, originRoot) + + // Valid YAML, name matches the directory, but no x-skillrig.version. + fm := "name: novers\n" + + "description: a skill missing its version\n" + + "metadata:\n" + + " x-skillrig.namespace: my-org" + writeSkillMD(t, originRoot, "novers", fm, "# Novers") + + git(t, originRoot, "init", "-q", "-b", "main") + git(t, originRoot, "add", "-A") + git(t, originRoot, "commit", "-q", "-m", "versionless skill") + + res := indexIn(t, originRoot) + if res.exit != 1 { + t.Fatalf("versionless-skill index exit = %d, want 1 (stderr: %s)", res.exit, res.stderr) + } + + if res.stdout != "" { + t.Errorf("error path must keep stdout empty, got: %q", res.stdout) + } + + if !strings.Contains(res.stderr, filepath.Join("skills", "novers", "SKILL.md")) { + t.Errorf("error must name the offending SKILL.md, got:\n%s", res.stderr) + } + + // And it must point at the missing version specifically (errors-as-navigation). + if !strings.Contains(res.stderr, "x-skillrig.version") { + t.Errorf("error must cite the missing x-skillrig.version, got:\n%s", res.stderr) + } +} + +// --------------------------------------------------------------------------- +// US2/US3/US4 — remote acquisition, --pin, injected failures (file:// substrate) +// +// The CLI's origin is a real file:// bare repo built in t.TempDir(); add with no +// local checkout clones it over a real git transport (FIX-1's file:// seam), +// offline. Every expected treeSha is the RAW-git oracle read straight from that +// bare repo (rawTreeSHA → `git rev-parse :`), NEVER skillcore (D11). +// --------------------------------------------------------------------------- + +// pinTag is the immutable release tag the remote-origin fixture publishes for +// the sample skill: the full-tag form of the bare semver sampleVersion under the +// origin's name-vSEMVER scheme. 
`--pin v1.4.0` expands to exactly this. +const pinTag = sampleSkill + "-v" + sampleVersion + +// remoteOrigin is a file:// bare-repo origin: a committed working tree (with the +// sample skill + an index.json + a release tag) pushed into a bare repo. The CLI +// is pointed at it via SKILLRIG_ORIGIN=file://, so the remote-fetch path +// clones it without any local checkout. +type remoteOrigin struct { + // bareDir is the bare git repo the CLI clones from (the file:// target). + bareDir string + // cloneURL is file://, the SKILLRIG_ORIGIN value the CLI resolves. + cloneURL string +} + +// newRemoteOrigin builds the file:// bare-repo substrate from the committed +// sample-origin fixture: it copies the fixture into a work tree, writes its +// index.json (so the convention gate the remote add runs sees skillrigConvention +// 1), commits with the pinned identity, tags the release (pinTag), then clones +// it --bare. The bare repo's default branch is main so an unpinned add resolves +// the origin @ref (HEAD/main); the tag makes a pinned add reproducible. +func newRemoteOrigin(t *testing.T) remoteOrigin { + t.Helper() + requireGit(t) + + work := t.TempDir() + copyTree(t, sampleOriginDir(t), work) + + // The committed fixture already ships an index.json; copyTree carried it, so + // the convention gate reads skillrigConvention 1 straight from the fixture. + git(t, work, "init", "-q", "-b", "main") + git(t, work, "add", "-A") + git(t, work, "commit", "-q", "-m", "origin fixture") + git(t, work, "tag", pinTag) + + bareDir := filepath.Join(t.TempDir(), "origin.git") + git(t, work, "clone", "-q", "--bare", work, bareDir) + + return remoteOrigin{bareDir: bareDir, cloneURL: "file://" + bareDir} +} + +// newRemoteOriginConvention mirrors newRemoteOrigin but rewrites the work-tree +// index.json so its declared skillrigConvention is `conv` instead of the +// fixture's 1, before committing and cloning --bare. 
It is a localized byte +// rewrite of the single convention token (not a full JSON re-encode), so the rest +// of the catalog — including the `requires` blocks catalogFile does not model — +// survives untouched. It is the substrate for the remote convention-gate test. +func newRemoteOriginConvention(t *testing.T, conv int) remoteOrigin { + t.Helper() + requireGit(t) + + work := t.TempDir() + copyTree(t, sampleOriginDir(t), work) + + // Localized rewrite: flip the declared convention from the fixture's 1 to conv. + indexPath := filepath.Join(work, "index.json") + before := readFile(t, indexPath) + after := strings.Replace(before, `"skillrigConvention": 1`, fmt.Sprintf(`"skillrigConvention": %d`, conv), 1) + + if after == before { + t.Fatalf("index.json did not contain the expected convention token to rewrite:\n%s", before) + } + + if err := os.WriteFile(indexPath, []byte(after), 0o644); err != nil { + t.Fatalf("rewrite index.json convention: %v", err) + } + + git(t, work, "init", "-q", "-b", "main") + git(t, work, "add", "-A") + git(t, work, "commit", "-q", "-m", "origin fixture (convention "+strconv.Itoa(conv)+")") + git(t, work, "tag", pinTag) + + bareDir := filepath.Join(t.TempDir(), "origin.git") + git(t, work, "clone", "-q", "--bare", work, bareDir) + + return remoteOrigin{bareDir: bareDir, cloneURL: "file://" + bareDir} +} + +// rawTree returns the RAW-git tree-SHA of the sample skill subtree at ref in the +// bare origin (the independent oracle, D11). The bare repo carries the full +// history, so `git rev-parse :` resolves the same tree object the CLI +// will fetch and fingerprint. +func (o remoteOrigin) rawTree(t *testing.T, ref string) string { + t.Helper() + + return rawTreeSHA(t, o.bareDir, ref, originSubtree) +} + +// remoteConsumer is a fresh git repo (no origin checkout) that vendors from a +// file:// origin via SKILLRIG_ORIGIN. 
+type remoteConsumer struct { + root string + cloneURL string +} + +// newRemoteConsumer git-inits a consumer repo bound (via env at call sites) to +// the remote origin. There is NO local OWNER/REPO checkout under it, so add must +// take the remote-fetch path against the file:// origin. +func newRemoteConsumer(t *testing.T, o remoteOrigin) remoteConsumer { + t.Helper() + requireGit(t) + + root := t.TempDir() + git(t, root, "init", "-q", "-b", "main") + + return remoteConsumer{root: root, cloneURL: o.cloneURL} +} + +// add runs `skillrig add args...` in the consumer with the origin bound via +// SKILLRIG_ORIGIN=file:// (the env precedence the resolver honors). +func (c remoteConsumer) add(t *testing.T, args ...string) runResult { + t.Helper() + + return run(t, runOpts{ + args: append([]string{"add"}, args...), + cwd: c.root, + env: map[string]string{"SKILLRIG_ORIGIN": c.cloneURL}, + }) +} + +// verify runs `skillrig verify` in the consumer. verify reads the lock + git and +// needs no origin, so no SKILLRIG_ORIGIN is bound — proving the vendored result +// stands on its own after a remote add. +func (c remoteConsumer) verify(t *testing.T, args ...string) runResult { + t.Helper() + + return run(t, runOpts{args: append([]string{"verify"}, args...), cwd: c.root}) +} + +// fakeGitBin writes a `git` shim into a fresh dir and returns the dir, for +// prepending to the binary's PATH. The shim passes EVERY git invocation through +// to the real git EXCEPT `clone`, which it fails with the crafted (exit 128, +// stderr) — the integration analog of pkg/skillcore's commandContext stub seam. +// gitToplevel (rev-parse) still succeeds, so the failure surfaces precisely at +// the remote fetch's clone phase (the catalog gate), letting the CLI render the +// auth/unreachable/not-found class distinctly. 
+func fakeGitBin(t *testing.T, stderr string) string { + t.Helper() + + realGit, err := exec.LookPath("git") + if err != nil { + t.Skip("git not on PATH; skipping injected-failure scenario") + } + + dir := t.TempDir() + + // For `clone` (the first positional arg), emit the crafted stderr and exit + // 128; otherwise exec the real git with all args. The crafted stderr is + // single-quoted by shellQuote, which escapes any embedded single quote. + script := "#!/bin/sh\n" + + "if [ \"$1\" = clone ]; then\n" + + " printf '%s\\n' " + shellQuote(stderr) + " 1>&2\n" + + " exit 128\n" + + "fi\n" + + "exec " + shellQuote(realGit) + " \"$@\"\n" + + if err := os.WriteFile(filepath.Join(dir, "git"), []byte(script), 0o755); err != nil { + t.Fatalf("write fake git: %v", err) + } + + return dir +} + +// shellQuote single-quotes s for POSIX sh, escaping embedded single quotes. +func shellQuote(s string) string { + return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'" +} + +// addWithFakeGit runs `skillrig add ` against the remote origin with the +// fake git (failing clones with stderr) prepended to the binary's PATH. The PATH +// override is placed in opts.env, which os/exec dedupes keeping the last value, +// so the shim shadows the real git for the child process only. +func (c remoteConsumer) addWithFakeGit(t *testing.T, stderr string, args ...string) runResult { + t.Helper() + + binDir := fakeGitBin(t, stderr) + + return run(t, runOpts{ + args: append([]string{"add"}, args...), + cwd: c.root, + env: map[string]string{ + "SKILLRIG_ORIGIN": c.cloneURL, + "PATH": binDir + string(os.PathListSeparator) + os.Getenv("PATH"), + }, + }) +} + +// TestQuickstart_AddRemoteNoLocalCopy (US2) — given a file:// origin and NO local +// checkout, add vendors the subtree byte-identical to the origin, records a lock +// entry whose treeSha == the RAW-git ground truth, and a subsequent verify (no +// origin needed) exits 0. 
+func TestQuickstart_AddRemoteNoLocalCopy(t *testing.T) { + t.Parallel() + + o := newRemoteOrigin(t) + c := newRemoteConsumer(t, o) + wantTree := o.rawTree(t, "HEAD") + + res := c.add(t, sampleSkill) + if res.exit != 0 { + t.Fatalf("remote add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + // Human shape: bounded (≤ 2 lines) with the verify footer hint. + lines := nonEmptyLines(res.stdout) + if len(lines) > 2 { + t.Errorf("human stdout has %d lines, want <= 2:\n%s", len(lines), res.stdout) + } + + if !strings.Contains(res.stdout, "skillrig verify") { + t.Errorf("remote add missing the verify footer hint:\n%s", res.stdout) + } + + // Vendored byte-identical to the origin fixture, including the exec bit. + assertVendoredMatchesFixture(t, c.root) + + // Lock: treeSha == the raw-git ground truth; version is the manifest version. + entry := lockEntry(t, c.root, sampleSkill) + if entry["treeSha"] != wantTree { + t.Errorf("lock treeSha = %v, want raw-git ground truth %s", entry["treeSha"], wantTree) + } + + if entry["version"] != sampleVersion { + t.Errorf("lock version = %v, want %s", entry["version"], sampleVersion) + } + + if entry["path"] != vendoredPath { + t.Errorf("lock path = %v, want %s", entry["path"], vendoredPath) + } + + if commit, _ := entry["commit"].(string); len(commit) != 40 { + t.Errorf("lock commit = %q, want a 40-hex commit SHA", commit) + } + + // Commit the vendored result, then verify must exit 0 (the round-trip). + commitAll(t, c.root, "vendor remote skill") + + if v := c.verify(t); v.exit != 0 { + t.Fatalf("verify after remote add exit = %d, want 0 (stderr: %s)", v.exit, v.stderr) + } +} + +// TestQuickstart_AddRemoteIdempotent (US2, SC-006) — re-running a remote add on +// the unchanged vendored skill reports unchanged, exits 0, and leaves the lock +// byte-unchanged. 
+func TestQuickstart_AddRemoteIdempotent(t *testing.T) { + t.Parallel() + + o := newRemoteOrigin(t) + c := newRemoteConsumer(t, o) + + if res := c.add(t, sampleSkill); res.exit != 0 { + t.Fatalf("first remote add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + lockBefore := readFile(t, filepath.Join(c.root, ".skillrig", "skills-lock.json")) + + second := c.add(t, sampleSkill) + if second.exit != 0 { + t.Fatalf("second remote add exit = %d, want 0 (stderr: %s)", second.exit, second.stderr) + } + + if !strings.Contains(second.stdout, "already vendored") && !strings.Contains(second.stdout, "no change") { + t.Errorf("idempotent remote re-add should note no change, got:\n%s", second.stdout) + } + + if after := readFile(t, filepath.Join(c.root, ".skillrig", "skills-lock.json")); after != lockBefore { + t.Errorf("lock changed on idempotent remote re-add:\nbefore=%s\nafter=%s", lockBefore, after) + } + + jsonRes := c.add(t, sampleSkill, "--json") + + obj := decodeJSON(t, jsonRes.stdout) + if obj["action"] != "unchanged" { + t.Errorf("--json action = %v on idempotent remote re-add, want unchanged", obj["action"]) + } +} + +// TestQuickstart_AddRemoteForceOnDivergence (US2) — a locally-modified vendored +// skill makes a plain re-add refuse with a --force hint (002 parity over the +// remote path); --force overwrites it back to the origin content. 
+func TestQuickstart_AddRemoteForceOnDivergence(t *testing.T) { + t.Parallel() + + o := newRemoteOrigin(t) + c := newRemoteConsumer(t, o) + + if res := c.add(t, sampleSkill); res.exit != 0 { + t.Fatalf("initial remote add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + skillMD := filepath.Join(c.root, vendoredPath, "SKILL.md") + appendByte(t, skillMD) + + refused := c.add(t, sampleSkill) + if refused.exit != 1 { + t.Fatalf("divergent remote re-add (no --force) exit = %d, want 1 (stderr: %s)", refused.exit, refused.stderr) + } + + if refused.stdout != "" { + t.Errorf("error path must keep stdout empty, got: %q", refused.stdout) + } + + assertContains(t, "fix", refused.stderr, "--force") + + // --force restores the origin content and reports overwritten. + forced := c.add(t, sampleSkill, "--force", "--json") + if forced.exit != 0 { + t.Fatalf("forced remote add exit = %d, want 0 (stderr: %s)", forced.exit, forced.stderr) + } + + if obj := decodeJSON(t, forced.stdout); obj["action"] != "overwritten" { + t.Errorf("--force --json action = %v, want overwritten", obj["action"]) + } + + originMD := readFile(t, filepath.Join(sampleOriginDir(t), "skills", sampleSkill, "SKILL.md")) + if readFile(t, skillMD) != originMD { + t.Errorf("--force should restore the remote skill to the origin's content") + } +} + +// TestQuickstart_AddRemoteDryRun (US2, C6/FR-020) — a remote add --dry-run prints +// a bounded preview, exits 0, and leaves the working tree + lock byte-unchanged. +func TestQuickstart_AddRemoteDryRun(t *testing.T) { + t.Parallel() + + o := newRemoteOrigin(t) + c := newRemoteConsumer(t, o) + + res := c.add(t, sampleSkill, "--dry-run") + if res.exit != 0 { + t.Fatalf("remote add --dry-run exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + if !strings.Contains(res.stdout, "would vendor") { + t.Errorf("dry-run human output should be prefixed 'would vendor …', got:\n%s", res.stdout) + } + + // Bounded preview. 
+	if lines := nonEmptyLines(res.stdout); len(lines) > 2 {
+		t.Errorf("dry-run preview has %d lines, want <= 2:\n%s", len(lines), res.stdout)
+	}
+
+	// Nothing written: no .agents tree, no lock, and a clean working tree.
+	if _, err := os.Stat(filepath.Join(c.root, ".agents")); !os.IsNotExist(err) {
+		t.Errorf(".agents/ must not exist after remote --dry-run, stat err = %v", err)
+	}
+
+	if _, err := os.Stat(filepath.Join(c.root, ".skillrig", "skills-lock.json")); !os.IsNotExist(err) {
+		t.Errorf("lock must not exist after remote --dry-run, stat err = %v", err)
+	}
+
+	if porcelain := statusPorcelain(t, c.root); porcelain != "" {
+		t.Errorf("working tree not clean after remote --dry-run:\n%s", porcelain)
+	}
+
+	jsonRes := c.add(t, sampleSkill, "--dry-run", "--json")
+
+	obj := decodeJSON(t, jsonRes.stdout)
+	requireKeys(t, obj, addResultKeys...)
+
+	if obj["dryRun"] != true {
+		t.Errorf("--json dryRun = %v, want true", obj["dryRun"])
+	}
+}
+
+// TestQuickstart_AddPinnedReproducible (US3, SC-004) — pinning the release tag on
+// TWO clean consumers yields byte-identical content and identical locks (same
+// version/commit/treeSha), and that treeSha is the RAW-git ground truth of the
+// tagged commit.
+func TestQuickstart_AddPinnedReproducible(t *testing.T) {
+	t.Parallel()
+
+	o := newRemoteOrigin(t)
+	wantTree := o.rawTree(t, pinTag)
+
+	first := newRemoteConsumer(t, o)
+	second := newRemoteConsumer(t, o)
+
+	if res := first.add(t, sampleSkill, "--pin", "v"+sampleVersion); res.exit != 0 {
+		t.Fatalf("first pinned add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	if res := second.add(t, sampleSkill, "--pin", "v"+sampleVersion); res.exit != 0 {
+		t.Fatalf("second pinned add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	e1 := lockEntry(t, first.root, sampleSkill)
+	e2 := lockEntry(t, second.root, sampleSkill)
+
+	// treeSha == raw-git ground truth AND identical across the two repos.
+	if e1["treeSha"] != wantTree || e2["treeSha"] != wantTree {
+		t.Errorf("pinned treeSha = %v / %v, want raw-git ground truth %s", e1["treeSha"], e2["treeSha"], wantTree)
+	}
+
+	if e1["commit"] != e2["commit"] {
+		t.Errorf("pinned commit drifted across clean repos: %v vs %v", e1["commit"], e2["commit"])
+	}
+
+	// The recorded version is the resolved tag (pin honesty, data-model §3).
+	if e1["version"] != pinTag {
+		t.Errorf("pinned version = %v, want the resolved tag %s", e1["version"], pinTag)
+	}
+
+	// Byte-identical vendored content across the two pinned consumers.
+	for _, f := range []string{"SKILL.md", "check.sh"} {
+		if readSkillFile(t, first.root, f) != readSkillFile(t, second.root, f) {
+			t.Errorf("pinned vendored %s differs across clean repos", f)
+		}
+	}
+}
+
+// TestQuickstart_AddPinTagFormEquivalent (US3, C3/SC-004) — `--pin v1.4.0`
+// (bare-semver expansion) and `--pin` with the full tag in pinTag, which ends in
+// `-v1.4.0` (full-tag literal), resolve to the SAME commit and treeSha,
+// confirming the deterministic --pin rule end-to-end over the file:// origin.
+func TestQuickstart_AddPinTagFormEquivalent(t *testing.T) {
+	t.Parallel()
+
+	o := newRemoteOrigin(t)
+
+	bare := newRemoteConsumer(t, o)
+	full := newRemoteConsumer(t, o)
+
+	if res := bare.add(t, sampleSkill, "--pin", "v"+sampleVersion); res.exit != 0 {
+		t.Fatalf("bare-semver pin add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	if res := full.add(t, sampleSkill, "--pin", pinTag); res.exit != 0 {
+		t.Fatalf("full-tag pin add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	bareEntry := lockEntry(t, bare.root, sampleSkill)
+	fullEntry := lockEntry(t, full.root, sampleSkill)
+
+	if bareEntry["commit"] != fullEntry["commit"] {
+		t.Errorf("bare-semver vs full-tag commit differ: %v vs %v (must be the same tag)", bareEntry["commit"], fullEntry["commit"])
+	}
+
+	if bareEntry["treeSha"] != fullEntry["treeSha"] {
+		t.Errorf("bare-semver vs full-tag treeSha differ: %v vs %v (SC-004)", bareEntry["treeSha"], fullEntry["treeSha"])
+	}
+}
+
+// TestQuickstart_AddPinNotFound (US3, C2/FR-015) — pinning a non-existent version
+// fails exit 1 with the distinct NoSuchVersionError rendering ("has no version" +
+// the pin-does-not-resolve why), NOT the skill-not-found message. The skill and
+// the repo exist; only the requested tag does not.
+func TestQuickstart_AddPinNotFound(t *testing.T) {
+	t.Parallel()
+
+	o := newRemoteOrigin(t)
+	c := newRemoteConsumer(t, o)
+
+	res := c.add(t, sampleSkill, "--pin", "v9.9.9")
+	if res.exit != 1 {
+		t.Fatalf("pin-not-found add exit = %d, want 1 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	if res.stdout != "" {
+		t.Errorf("error path must keep stdout empty, got: %q", res.stdout)
+	}
+
+	// The NoSuchVersionError rendering is the structured discriminator: it names
+	// the missing version and cites the unresolved pin — distinct from a
+	// skill-not-found ("not found in the origin") class (C2).
+	assertContains(t, "what", res.stderr, "has no version")
+	assertContains(t, "why", res.stderr, "the pin does not resolve")
+
+	if strings.Contains(res.stderr, "not found in the origin") {
+		t.Errorf("pin-not-found must NOT render as skill-not-found (C2: distinct classes), got:\n%s", res.stderr)
+	}
+
+	// The raw git cause is surfaced under --verbose, never swallowed.
+	verbose := c.add(t, sampleSkill, "--pin", "v9.9.9", "--verbose")
+	if verbose.exit != 1 {
+		t.Errorf("--verbose pin-not-found exit = %d, want 1", verbose.exit)
+	}
+}
+
+// TestQuickstart_AddAuthFailureDistinct (US4) — an injected clone auth failure
+// renders as an AUTHENTICATION failure (distinct from not-found/unreachable),
+// pointing at gh auth login / GITHUB_TOKEN; exit 1.
+func TestQuickstart_AddAuthFailureDistinct(t *testing.T) {
+	t.Parallel()
+
+	o := newRemoteOrigin(t)
+	c := newRemoteConsumer(t, o)
+
+	res := c.addWithFakeGit(t,
+		"remote: Authentication failed for 'https://github.com/my-org/my-skills/'",
+		sampleSkill)
+
+	if res.exit != 1 {
+		t.Fatalf("injected auth-failure add exit = %d, want 1 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	if res.stdout != "" {
+		t.Errorf("error path must keep stdout empty, got: %q", res.stdout)
+	}
+
+	assertContains(t, "what", res.stderr, "authentication failed")
+	// Distinct from unreachable / not-found, and points at the credential fix.
+	if strings.Contains(res.stderr, "could not reach") || strings.Contains(res.stderr, "not found") {
+		t.Errorf("auth failure must be distinct from unreachable/not-found, got:\n%s", res.stderr)
+	}
+
+	if !strings.Contains(res.stderr, "gh auth login") && !strings.Contains(res.stderr, "GITHUB_TOKEN") {
+		t.Errorf("auth failure fix should point at gh auth login / GITHUB_TOKEN, got:\n%s", res.stderr)
+	}
+}
+
+// TestQuickstart_AddUnreachableDistinct (US4) — an injected clone "could not
+// resolve host" failure renders as an UNREACHABLE failure, distinct from
+// auth/not-found; exit 1.
+func TestQuickstart_AddUnreachableDistinct(t *testing.T) {
+	t.Parallel()
+
+	o := newRemoteOrigin(t)
+	c := newRemoteConsumer(t, o)
+
+	res := c.addWithFakeGit(t,
+		"fatal: unable to access origin: Could not resolve host: github.com",
+		sampleSkill)
+
+	if res.exit != 1 {
+		t.Fatalf("injected unreachable add exit = %d, want 1 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	if res.stdout != "" {
+		t.Errorf("error path must keep stdout empty, got: %q", res.stdout)
+	}
+
+	assertContains(t, "what", res.stderr, "could not reach")
+
+	if strings.Contains(res.stderr, "authentication failed") {
+		t.Errorf("unreachable must be distinct from auth, got:\n%s", res.stderr)
+	}
+}
+
+// TestQuickstart_AddPrivateNotFoundHintsAuth (US4, D4) — an injected clone
+// not-found with no resolved token renders a not-found that ALSO adds the "if
+// private, authenticate" hint, so the agent is not sent to re-check a skill name
+// when the real problem is a missing credential; exit 1.
+func TestQuickstart_AddPrivateNotFoundHintsAuth(t *testing.T) {
+	t.Parallel()
+
+	o := newRemoteOrigin(t)
+	c := newRemoteConsumer(t, o)
+
+	res := c.addWithFakeGit(t,
+		"fatal: repository 'https://github.com/my-org/my-skills/' not found",
+		sampleSkill)
+
+	if res.exit != 1 {
+		t.Fatalf("injected private-not-found add exit = %d, want 1 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	if res.stdout != "" {
+		t.Errorf("error path must keep stdout empty, got: %q", res.stdout)
+	}
+
+	assertContains(t, "what", res.stderr, "not found")
+	// The D4 subtlety: an unauthenticated not-found adds the authenticate hint.
+	if !strings.Contains(res.stderr, "authenticate") {
+		t.Errorf("private-not-found should add the 'if private, authenticate' hint, got:\n%s", res.stderr)
+	}
+}
+
+// TestQuickstart_AddRemoteConventionMismatch (FIX-4/H1) — a file:// origin whose
+// index.json declares skillrigConvention 2 is rejected by the remote add's
+// convention gate end-to-end: exit 1 with the IncompatibleConvention what/why/fix,
+// and NOTHING written (no .agents/, no .skillrig/ in the consumer). This proves
+// gateRemoteConvention runs over the real remote-fetch path before any vendoring.
+func TestQuickstart_AddRemoteConventionMismatch(t *testing.T) {
+	t.Parallel()
+
+	o := newRemoteOriginConvention(t, 2)
+	c := newRemoteConsumer(t, o)
+
+	res := c.add(t, sampleSkill)
+	if res.exit != 1 {
+		t.Fatalf("convention-2 remote add exit = %d, want 1 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	if res.stdout != "" {
+		t.Errorf("error path must keep stdout empty, got: %q", res.stdout)
+	}
+
+	// The IncompatibleConvention rendering: what (a convention mismatch), why, and
+	// the fix that points at updating skillrig.
+	assertContains(t, "what", res.stderr, "convention")
+	assertContains(t, "why", res.stderr, "why:")
+	assertContains(t, "fix", res.stderr, "update skillrig")
+
+	// The gate runs BEFORE any write: no vendored tree and no lock must exist.
+	if _, err := os.Stat(filepath.Join(c.root, ".agents")); !os.IsNotExist(err) {
+		t.Errorf(".agents/ must not exist after a gated remote add, stat err = %v", err)
+	}
+
+	if _, err := os.Stat(filepath.Join(c.root, ".skillrig")); !os.IsNotExist(err) {
+		t.Errorf(".skillrig/ must not exist after a gated remote add, stat err = %v", err)
+	}
+}
+
+// assertVendoredMatchesFixture checks every vendored skill file under root is
+// byte-identical to the committed sample-origin fixture, with modes preserved
+// (the exec bit is part of the tree-SHA). The remote-fetched copy must match the
+// origin source exactly.
+func assertVendoredMatchesFixture(t *testing.T, root string) {
+	t.Helper()
+
+	for _, f := range []string{"SKILL.md", "check.sh"} {
+		got := readSkillFile(t, root, f)
+		want := readFile(t, filepath.Join(sampleOriginDir(t), "skills", sampleSkill, f))
+
+		if got != want {
+			t.Errorf("vendored %s differs from the origin fixture", f)
+		}
+
+		gotMode := fileMode(t, filepath.Join(root, vendoredPath, f))
+		wantMode := fileMode(t, filepath.Join(sampleOriginDir(t), "skills", sampleSkill, f))
+
+		if gotMode != wantMode {
+			t.Errorf("vendored %s mode = %v, want %v", f, gotMode, wantMode)
+		}
+	}
+
+	if execMode := fileMode(t, filepath.Join(root, vendoredPath, "check.sh")); execMode&0o111 == 0 {
+		t.Errorf("vendored check.sh lost its executable bit: mode = %v", execMode)
+	}
+}
+
+// TestQuickstart_SearchRemoteFileOrigin (US1 over the remote substrate) — search
+// against a file:// origin with NO local checkout fetches index.json over the
+// real git transport (FIX-2's per-call catalog fetch) and lists the skill the
+// origin publishes; exit 0 with the complete --json record. This proves search
+// works end-to-end against a remote origin, not just a local catalog on disk.
+func TestQuickstart_SearchRemoteFileOrigin(t *testing.T) {
+	t.Parallel()
+
+	o := newRemoteOrigin(t)
+	c := newRemoteConsumer(t, o)
+
+	res := run(t, runOpts{
+		args: []string{"search", "--json"},
+		cwd:  c.root,
+		env:  map[string]string{"SKILLRIG_ORIGIN": c.cloneURL},
+	})
+	if res.exit != 0 {
+		t.Fatalf("remote search exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	p := decodeSearch(t, res.stdout)
+	if names := searchNames(p); len(names) != 1 || names[0] != sampleSkill {
+		t.Errorf("remote search names = %v, want exactly [%s] (fetched from the file:// origin)", names, sampleSkill)
+	}
+
+	// --json record is structurally complete (every field add needs).
+	obj := decodeJSON(t, res.stdout)
+	requireKeys(t, obj, "origin", "skills")
+
+	rawSkills, ok := obj["skills"].([]any)
+	if !ok || len(rawSkills) == 0 {
+		t.Fatalf("remote search skills not a non-empty array: %v", obj["skills"])
+	}
+
+	entry, ok := rawSkills[0].(map[string]any)
+	if !ok {
+		t.Fatalf("remote search skills[0] not an object: %v", rawSkills[0])
+	}
+
+	requireKeys(t, entry, "name", "version", "namespace", "description", "topics", "path")
+}
+
+// TestQuickstart_SearchRemoteFromNonGitDir (FIX-7) — search needs no git working
+// tree: run from a PLAIN temp dir that is NOT a git repo, against a file:// origin
+// with no local checkout, it fetches index.json over the real git transport and
+// lists the skill; exit 0. A repo is only an optional local-checkout fast-path, so
+// `skillrig search` must work outside one (especially against a remote origin).
+func TestQuickstart_SearchRemoteFromNonGitDir(t *testing.T) {
+	t.Parallel()
+	requireGit(t)
+
+	o := newRemoteOrigin(t)
+
+	// A plain temp dir with NO `git init` — search must not require a repo root.
+	nonGitDir := t.TempDir()
+
+	res := run(t, runOpts{
+		args: []string{"search"},
+		cwd:  nonGitDir,
+		env:  map[string]string{"SKILLRIG_ORIGIN": o.cloneURL},
+	})
+	if res.exit != 0 {
+		t.Fatalf("search from a non-git dir exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	// It lists the skill the file:// origin publishes (fetched, no local checkout).
+	if !strings.Contains(res.stdout, sampleSkill) {
+		t.Errorf("search from a non-git dir omits %q:\n%s", sampleSkill, res.stdout)
+	}
+
+	if !strings.Contains(res.stdout, "skillrig add") {
+		t.Errorf("search listing missing the add footer hint:\n%s", res.stdout)
+	}
+}
diff --git a/test/skillcore_quickstart_test.go b/test/skillcore_quickstart_test.go
index 461ce02..85c3c9a 100644
--- a/test/skillcore_quickstart_test.go
+++ b/test/skillcore_quickstart_test.go
@@ -31,7 +31,8 @@ const originRepo = "my-org/my-skills"
 // sampleSkill is the one skill the sample origin ships.
 const sampleSkill = "terraform-plan-review"
 
-// sampleVersion is the version recorded in the fixture's skill.toml.
+// sampleVersion is the version recorded in the fixture's SKILL.md frontmatter
+// (metadata.x-skillrig.version).
 const sampleVersion = "1.4.0"
 
 // originSubtree is the origin-relative path whose git tree-object SHA is the
@@ -278,7 +279,7 @@ var countsKeys = []string{"verified", "mismatch", "orphan", "missing", "dirty"}
 func assertVendoredMatchesOrigin(t *testing.T, c consumerRepo) {
 	t.Helper()
 
-	for _, f := range []string{"SKILL.md", "skill.toml", "check.sh"} {
+	for _, f := range []string{"SKILL.md", "check.sh"} {
 		got := readSkillFile(t, c.root, f)
 		want := readFile(t, filepath.Join(c.originDir, "skills", sampleSkill, f))
 
diff --git a/test/testdata/sample-origin/index.json b/test/testdata/sample-origin/index.json
new file mode 100644
index 0000000..5edc709
--- /dev/null
+++ b/test/testdata/sample-origin/index.json
@@ -0,0 +1,32 @@
+{
+  "skillrigConvention": 1,
+  "origin": "my-org/my-skills",
+  "skills": [
+    {
+      "name": "terraform-plan-review",
+      "version": "1.4.0",
+      "namespace": "my-org",
+      "description": "Review a terraform plan for risk and drift before apply, flagging destructive changes, IAM/security-policy edits, and resources that will be replaced rather than updated.",
+      "topics": [
+        "platform-team",
+        "terraform",
+        "aws"
+      ],
+      "path": "skills/terraform-plan-review",
+      "requires": [
+        {
+          "tool": "oxid",
+          "version": ">=0.4.0",
+          "source": "my-org/my-skills",
+          "manager": "mise"
+        },
+        {
+          "tool": "terraform",
+          "version": ">=1.6",
+          "source": "hashicorp/terraform",
+          "manager": "mise"
+        }
+      ]
+    }
+  ]
+}
diff --git a/test/testdata/sample-origin/skills/terraform-plan-review/SKILL.md b/test/testdata/sample-origin/skills/terraform-plan-review/SKILL.md
index d3867f3..f627411 100644
--- a/test/testdata/sample-origin/skills/terraform-plan-review/SKILL.md
+++ b/test/testdata/sample-origin/skills/terraform-plan-review/SKILL.md
@@ -1,6 +1,20 @@
 ---
 name: terraform-plan-review
 description: Review a terraform plan for risk and drift before apply, flagging destructive changes, IAM/security-policy edits, and resources that will be replaced rather than updated.
+metadata:
+  x-skillrig.namespace: my-org
+  x-skillrig.version: 1.4.0
+  x-skillrig.convention-version: "1"
+  x-skillrig.topics: [platform-team, terraform, aws]
+  x-skillrig.requires:
+    - tool: oxid
+      version: ">=0.4.0"
+      source: my-org/my-skills
+      manager: mise
+    - tool: terraform
+      version: ">=1.6"
+      source: hashicorp/terraform
+      manager: mise
 ---
 
 # Terraform Plan Review
diff --git a/test/testdata/sample-origin/skills/terraform-plan-review/skill.toml b/test/testdata/sample-origin/skills/terraform-plan-review/skill.toml
deleted file mode 100644
index f82d3e7..0000000
--- a/test/testdata/sample-origin/skills/terraform-plan-review/skill.toml
+++ /dev/null
@@ -1,23 +0,0 @@
-# Per-skill machine-facing manifest. Vendors with the skill into consumer repos.
-name = "terraform-plan-review"
-version = "1.4.0"
-namespace = "my-org"
-description = "Review a terraform plan for risk and drift."
-
-# Deterministic discovery tags.
-tags = ["platform-team", "terraform", "aws"]
-
-# Backing-CLI prerequisites: DECLARED, not installed. These tools are absent in
-# the test environment on purpose — verify is integrity-only and MUST NOT check
-# them (SC-006/FR-014), so a pass with these present still exits 0.
-[[requires]]
-tool = "oxid"
-version = ">=0.4.0"
-source = "my-org/my-skills"
-manager = "mise"
-
-[[requires]]
-tool = "terraform"
-version = ">=1.6"
-source = "hashicorp/terraform"
-manager = "mise"