Merged
17 commits
2f05fae
docs(003): spec for Discover & Acquire (search + remote add) MVP
so0k May 30, 2026
69e6104
docs(003): clarify session 2026-05-31 — resolve 7 open decisions
so0k May 30, 2026
7664844
docs(003): fold in spikes S1-S4 — manifest→frontmatter + skillrig ind…
so0k May 30, 2026
fa62639
docs(003): implementation plan + Phase 0/1 design artifacts
so0k May 30, 2026
30f8e88
docs(003): record query-first search + tag→topic decision (Spike S5 p…
so0k May 30, 2026
a0ba8bb
docs(003): fold Spike S5 — query-first search, tag→topic, stdlib-only
so0k May 30, 2026
1e5712c
docs(003): plan build-seq — S5 matcher + origin-seeding step (npx ski…
so0k May 30, 2026
82f7bd8
chore(agents): verify-workflow reads brief + report template from disk
so0k May 30, 2026
cbfb574
docs(003): verify-workflow report + remediate all 14 findings
so0k May 30, 2026
429f0ca
feat(003): implement search + remote add + index (manifest→frontmatter)
so0k May 30, 2026
4bcd7f9
docs(003): checkpoint divergence review (post-implementation)
so0k May 30, 2026
650d915
docs(003): record adversarial deep-dive — P1 remote keystone not ship…
so0k May 30, 2026
18aebf2
docs: scrub skill.toml→SKILL.md frontmatter confusion; deprecate in C…
so0k May 31, 2026
5f90a82
fix(003): remediate adversarial deep-dive — remote keystone now real …
so0k May 31, 2026
c1a915a
test(003): close re-review F1/F2/F3; record remediation + re-review
so0k May 31, 2026
d836344
fix(003): address Qodo PR#8 — search git-repo/Args, token via GIT_CON…
so0k May 31, 2026
c3e95c7
docs: constitution 2.1.1 — scrub deprecated skill.toml terminology (c…
so0k May 31, 2026
107 changes: 107 additions & 0 deletions .agents/commands/specledger.verify-workflow.md
@@ -0,0 +1,107 @@
---
description: EXPERIMENTAL — cross-artifact verification WITHOUT tasks.md. Fans out N INDEPENDENT reviewers (spec → plan/research/data-model/contracts/quickstart) via a deterministic Workflow, then MERGES their findings into one report (independent passes catch complementary issues). Read-only. Run from a FRESH session at DEFAULT effort.
handoffs:
  - label: Implement (workflow)
    agent: specledger.implement-workflow
    prompt: Implement the verified feature via the workflow pipeline
---

## User Input

```text
$ARGUMENTS
```

Optional `$ARGUMENTS`: number of independent reviewers (default 2), or extra focus (e.g. "emphasize security", a feature id).

## Purpose

Read-only cross-artifact consistency verification for a feature whose **`tasks.md` was intentionally not generated** (e.g. when using `/specledger.implement-workflow`). It validates **`spec.md` against `plan.md`, `research.md`, `data-model.md`, `contracts/*.md`, and `quickstart.md`** — the planning artifacts — and produces a Specification Analysis Report.

**Why a workflow, not a single pass:** independent reviewers reliably catch *different* things. (Real example: one pass caught a committed-tree-vs-working-tree spec↔plan ambiguity; another caught a stale acceptance scenario the first missed.) So this runs **N independent reviewers in parallel** and **merges** them — keeping complementary findings, deduping overlap.

> **STRICTLY READ-ONLY.** Reviewers and the merge step report findings and recommend fixes; they do **not** edit artifacts. The only write is the optional, explicit save of the report (last step).
> **Do NOT flag the absence of `tasks.md`** — it is intentional for this flow. Skip the task↔requirement and `TestQuickstart_*`-task-mapping checks.

> **Pause for model and effort.** Subagents inherit the launcher's *model* whenever the script's `model` argument is left unset. Before launching, **AskUserQuestion which model** the reviewers should use, and offer to pause so the user can change effort first: **effort is inherited** from the launching session, so launching from a cheaper, lower-effort session keeps fan-out costs under control.

## What each reviewer checks (the four focus areas)

1. **Coverage** — every Functional Requirement (FR-*) and Success Criterion (SC-*) in `spec.md` is covered by the plan + a contract + a `TestQuickstart_*` scenario. Flag any requirement with no downstream coverage.
2. **Reverse traceability** — every contract behavior and every quickstart scenario traces back to a spec requirement (no invented behavior ungrounded in the spec).
3. **Consistency** — no contradictions across artifacts (exit codes, lock schema, fingerprint semantics, origin resolution, etc.); in particular flag any **AMBIGUITY that would let an implementer/model decide a behavior two different ways**.
4. **Decision integrity** — the spec's recorded clarification decisions are applied **consistently everywhere they appear**, with **no leftover stale wording**.

Constitution (`.specledger/memory/constitution.md`) is in scope: a MUST-principle conflict is automatically CRITICAL.
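
The Coverage check (focus area 1) can be approximated mechanically. The sketch below is only an illustration, assuming requirement IDs look like `FR-001` / `SC-001` and treating "covered" as "the ID is mentioned in that artifact"; the function name and sample texts are hypothetical, and real reviewers judge coverage by reading the artifacts.

```python
import re

# Assumed ID shape: FR-<digits> or SC-<digits>; adjust to the spec's real convention.
REQ_ID = re.compile(r"\b(?:FR|SC)-\d+\b")


def coverage_gaps(spec_text, artifacts):
    """Map each requirement ID in the spec to the artifacts that never mention it."""
    gaps = {}
    for req in sorted(set(REQ_ID.findall(spec_text))):
        missing = [name for name, text in artifacts.items()
                   if req not in REQ_ID.findall(text)]
        if missing:
            gaps[req] = missing
    return gaps


# Hypothetical sample texts standing in for the real files.
spec = "FR-001 search by query. FR-002 remote add. SC-001 works offline."
artifacts = {
    "plan": "Implements FR-001 and FR-002; targets SC-001.",
    "contracts": "Search contract: FR-001, SC-001.",
    "quickstart": "TestQuickstart_Search exercises FR-001 and SC-001.",
}
print(coverage_gaps(spec, artifacts))
```

A requirement that appears in the plan but in no contract or quickstart scenario is the kind of gap the `coverageGaps` field is meant to carry.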

## Execution steps

1. **Locate artifacts**: run `sl spec info --json --paths-only`; read `FEATURE_DIR`. (The reviewers Read the artifacts themselves.)
2. **Discover relevant skills**: enumerate the skills available in the session (the available-skills list surfaced by the harness, or invoke `/find-skills` to fill a gap). **Focus on design skills**, e.g. cobra, agentic CLI design, Supabase Architecture, REST and data modeling. Workflow subagents **do** have the `Skill` tool (verified empirically), so every review agent prompt **can and MUST** instruct the agent to load its relevant design and architecture skills via the `Skill` tool *before* reviewing artifacts. Record the review skills you'll bake into the brief.
3. **AskUserQuestion**: in one batched question, ask which `model` the reviewers should use (effort is inherited from this session) and which of the relevant skills to load (multiple selections allowed). Pass the chosen `model` into the script, or leave `model` unset to inherit.
4. **Write the feature-specific reviewer brief to disk** at `FEATURE_DIR/reviews/_reviewer-brief.md` (create `reviews/` if needed). It carries everything reviewer-facing — the **SKILLS line** (the chosen skills to load), the read-only rule, the artifact list, the feature context, the four focus areas, and the constitution note. **Why on disk:** the workflow script is plain JS, and embedding long multi-paragraph prompts as string literals is parse-fragile (a stray `/*` glob, an unescaped backtick/apostrophe, or a mis-counted paren breaks the whole script). Keeping the prose in a file makes the script tiny and robust, and gives a single inspectable/editable source of truth. The report template already lives on disk at `.specledger/templates/review-report-template.md` — reviewers/merge read it rather than re-deriving the format. *(Scaffolding files use a `_` prefix; offer to delete them after the run.)*
5. **Author + launch the Workflow** below (it just hands agents the on-disk paths).
6. When it returns, **present the merged report**. Then **offer to save** it (final step).

## Skill loading is mandatory (not optional)

> The reviewer brief MUST begin with a **`SKILLS:` line** naming the skills to invoke via the `Skill` tool and apply *before* reviewing. Design artifacts say *what* to build; the skills carry *how this repo designs it* — relying on the artifacts alone leaves that on the table. Workflow subagents have the `Skill` tool; do **not** distill skill content into the brief by hand and do **not** assume an agent will load a skill unprompted.
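
As a rough illustration of the "brief begins with a `SKILLS:` line" rule, the sketch below renders a brief and checks that invariant. The brief layout beyond the first line, the helper names, and the skill names in the usage example are all assumptions made for illustration, not the command's actual format.

```python
def build_brief(skills, artifacts, focus_areas):
    """Render a reviewer brief whose first line is the mandatory SKILLS line."""
    if not skills:
        raise ValueError("the brief must name at least one skill to load")
    lines = ["SKILLS: " + ", ".join(skills), ""]
    lines.append("READ-ONLY: report findings and recommend fixes; edit nothing.")
    lines.append("")
    lines.append("Artifacts to read:")
    lines.extend("- " + path for path in artifacts)
    lines.append("")
    lines.append("Focus areas:")
    lines.extend(f"{i}. {area}" for i, area in enumerate(focus_areas, start=1))
    return "\n".join(lines) + "\n"


def starts_with_skills_line(brief_text):
    """Check the invariant: the very first line is a non-empty SKILLS line."""
    first = brief_text.splitlines()[0] if brief_text else ""
    return first.startswith("SKILLS:") and first[len("SKILLS:"):].strip() != ""


# Hypothetical skill names and artifact list.
brief = build_brief(
    ["cobra", "rest-data-modeling"],
    ["spec.md", "plan.md", "contracts/search.md"],
    ["Coverage", "Reverse traceability", "Consistency", "Decision integrity"],
)
print(brief.splitlines()[0])
```

Running a check like `starts_with_skills_line` before launching the workflow is one cheap way to catch a brief that forgot the skills.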

## Workflow pipeline (author this script)

> Keep the script **minimal**: it reads the on-disk brief + report template (step 4) and passes their paths to agents. Do **not** embed long prose, globs (`/*`), or multi-paragraph strings — that is the parse-fragility this disk-based design exists to avoid. Use an explicit `for`-loop to build the thunks (clearer paren-balance than a nested `parallel(Array.from(...))` one-liner).

```
export const meta = {
  name: 'verify-artifacts',
  description: 'Cross-artifact verification (no tasks): N independent reviewers + merge, reading on-disk brief + template',
  phases: [{ title: 'Review' }, { title: 'Merge' }],
}

const FD = args.featureDir
const N = args.reviewers || 2
const MODEL = args.model // undefined → inherit launcher; or set from the AskUserQuestion answer
const BRIEF = FD + '/reviews/_reviewer-brief.md'
const TEMPLATE = '.specledger/templates/review-report-template.md'

// JSON schema every reviewer must return.
const FINDINGS = {
  type: 'object',
  required: ['findings'],
  properties: {
    findings: {
      type: 'array',
      items: {
        type: 'object',
        required: ['category', 'severity', 'location', 'summary', 'recommendation'],
        properties: {
          category: { type: 'string' }, // Coverage|Traceability|Consistency|DecisionIntegrity|Constitution|Ambiguity
          severity: { enum: ['CRITICAL', 'HIGH', 'MEDIUM', 'LOW', 'INFO'] },
          location: { type: 'string' }, // file:line(s) or section refs
          summary: { type: 'string' },
          recommendation: { type: 'string' },
        },
      },
    },
    coverageGaps: { type: 'array', items: { type: 'string' } }, // FR-*/SC-* with no downstream coverage
    staleWording: { type: 'array', items: { type: 'string' } },
  },
}

// Phase 1: N INDEPENDENT reviewers (parallel, fresh context each), schema'd findings.
phase('Review')
const thunks = []
for (let i = 0; i < N; i++) {
  const pass = i + 1
  thunks.push(() => agent(
    'Read ' + BRIEF + ' FIRST and follow it exactly (load the SKILLS it names via the Skill tool before reviewing, obey the read-only rule, read every artifact it lists). You are INDEPENDENT reviewer pass ' + pass + ' of ' + N + '. Perform the four-focus-area cross-verification described in the brief and return findings per the StructuredOutput schema, citing file:line. Edit nothing.',
    { schema: FINDINGS, model: MODEL, phase: 'Review', label: 'reviewer#' + pass }))
}
// Drop passes that returned nothing so the merge only sees real results.
const passes = (await parallel(thunks)).filter(Boolean)

// Phase 2: Merge. Keep complementary findings, dedup overlap, reconcile severity, fill the on-disk template.
phase('Merge')
const report = await agent(
  'Read the report template at ' + TEMPLATE + ' and emit a single filled copy as your output, following its structure EXACTLY. Merge ' + passes.length + ' independent review passes: KEEP complementary findings, DEDUP true overlaps (same location+claim), reconcile each severity to the highest justified. Fill the coverage table (one row per FR/SC), the decision-integrity checklist, the metrics, and next actions. STRICTLY READ-ONLY. The passes as schema JSON: ' + JSON.stringify(passes),
  { model: MODEL, phase: 'Merge', label: 'merge' })

return { report, reviewers: passes.length }
```

Pass `args: { featureDir: "<FEATURE_DIR>", reviewers: <N>, model: "<choice-or-undefined>" }`.
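
In the workflow the merge is performed by an agent, which can judge whether two findings make the same claim. The sketch below is only a deterministic approximation of that merge policy (keep complementary findings, dedup on same location + claim, keep the highest severity), assuming overlap can be detected by case-insensitive matching on `location` and `summary`; the function name and sample findings are hypothetical.

```python
SEVERITY_ORDER = ["INFO", "LOW", "MEDIUM", "HIGH", "CRITICAL"]


def merge_passes(passes):
    """Merge reviewer passes: dedup on (location, summary), keep the highest severity."""
    merged = {}
    for review in passes:
        for finding in review.get("findings", []):
            key = (finding["location"].strip().lower(),
                   finding["summary"].strip().lower())
            kept = merged.get(key)
            if kept is None:
                merged[key] = dict(finding)  # first sighting: keep a copy
            elif (SEVERITY_ORDER.index(finding["severity"])
                  > SEVERITY_ORDER.index(kept["severity"])):
                kept["severity"] = finding["severity"]  # escalate, keep first wording
    # Most severe first; the stable sort keeps first-seen order within a severity.
    return sorted(merged.values(),
                  key=lambda f: -SEVERITY_ORDER.index(f["severity"]))


# Hypothetical findings from two independent passes.
passes = [
    {"findings": [
        {"category": "Consistency", "severity": "MEDIUM", "location": "plan.md:42",
         "summary": "Exit code differs from contract",
         "recommendation": "Align on one exit code"},
    ]},
    {"findings": [
        {"category": "Consistency", "severity": "HIGH", "location": "plan.md:42",
         "summary": "exit code differs from contract",
         "recommendation": "Use the contract's value"},
        {"category": "DecisionIntegrity", "severity": "LOW", "location": "spec.md:10",
         "summary": "Stale acceptance scenario",
         "recommendation": "Update the scenario"},
    ]},
]
merged = merge_passes(passes)
print([(f["location"], f["severity"]) for f in merged])
```

Exact-text matching will miss overlaps that are worded differently, which is one reason the real merge step is delegated to an agent rather than a script.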

## Report format

The merge agent fills the on-disk template at **`.specledger/templates/review-report-template.md`** (findings table → coverage summary → decision integrity → metrics → next actions). Present that filled report to the user.

## Final step — offer to save (explicit, opt-in write)

After presenting the report, **AskUserQuestion**: save to `FEATURE_DIR/reviews/<spec-number>-review.md`? If yes, write the report with YAML frontmatter (`date`, `total_requirements`, `total_tasks: 0`, `coverage_pct`, `critical_issues`), creating `reviews/` if needed, and confirm the path. If a review already exists, offer to **merge into it** (mark resolved/open) rather than overwrite blindly.
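
A minimal sketch of rendering the frontmatter fields named above. It assumes an ISO-8601 `date` and an integer-percent `coverage_pct`, neither of which the command text specifies; the function name and parameters are hypothetical.

```python
from datetime import date


def render_frontmatter(total_requirements, covered, critical_issues, on=None):
    """Render the YAML frontmatter block for a saved review report."""
    on = on or date.today()
    # Guard against division by zero when the spec lists no requirements.
    pct = round(100 * covered / total_requirements) if total_requirements else 0
    fields = [
        ("date", on.isoformat()),
        ("total_requirements", total_requirements),
        ("total_tasks", 0),  # always 0: this flow intentionally has no tasks.md
        ("coverage_pct", pct),
        ("critical_issues", critical_issues),
    ]
    body = "\n".join(f"{key}: {value}" for key, value in fields)
    return f"---\n{body}\n---\n"


print(render_frontmatter(12, 11, 1, on=date(2026, 5, 31)))
```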
75 changes: 75 additions & 0 deletions .agents/skills/qodo-manage-rules/README.md
@@ -0,0 +1,75 @@
# qodo-manage-rules

The single skill for working with your org's **Qodo coding rules** — both *consuming* them
(load the rules relevant to a coding task and apply them while you code) and *administering*
them (modify / scope / deactivate a rule when it's wrong, over-broad, or stale).

See [`SKILL.md`](./SKILL.md) for the full workflow. This README covers **provenance** and
**setup** only.

## Why this exists (and what it replaces)

It replaces the upstream **`qodo-get-rules`** skill, which was vendored from
[`github.com/qodo-ai/qodo-skills`](https://github.com/qodo-ai/qodo-skills) — an
**abandoned** repository. That skill was both incomplete (read-only; no way to manage
rules) and **broken in practice**: it read the API token from `~/.qodo/config.json`'s
`API_KEY` field, which the current Qodo CLI does not write there. Rather than fork a dead
upstream, this skill is self-contained:

- correct auth (reads `~/.qodo/auth.key`, the file the Qodo CLI actually writes),
- adds the rule-management API (list / search / get / modify / deactivate), reverse-engineered
from the web portal and verified live (see [`references/api-contract.md`](./references/api-contract.md)),
- folds the "load rules for a coding task" job back in, done right (see
[`references/loading-rules.md`](./references/loading-rules.md)).

**Use this skill instead of `qodo-get-rules`.** The old one has been removed from this repo's
`skills-lock.json`.

## Setup — getting the API token

The skill authenticates with a raw `sk-...` bearer token at **`~/.qodo/auth.key`**
(or the `QODO_API_KEY` env var, which takes precedence).

The fastest way to mint that file is the **Qodo Command** CLI
([docs.qodo.ai/qodo-command](https://docs.qodo.ai/qodo-command)):

```sh
qodo login # opens a browser, authenticates, and writes ~/.qodo/auth.key
```

> ⚠️ **Qodo Command is itself sunset.** Its *chat* is dead — any chat message just returns a
> "this tool is sunset" notice. But `qodo login` still works and is the easiest way to
> create the API key. After login you don't need the CLI again; this skill talks to the
> rules API directly. (If you already have a token from the Qodo web portal, you can skip
> the CLI entirely and write it to `~/.qodo/auth.key` yourself, or export `QODO_API_KEY`.)

Verify it's in place (the token file is 90 bytes and its content starts with `sk-`):

```sh
ls -lah ~/.qodo/auth.key
```

## Configuration

| What | Source (highest precedence first) | Default |
|------|-----------------------------------|---------|
| **Token** | `$QODO_API_KEY` → `~/.qodo/auth.key` | — (required) |
| **API base** | `$QODO_API_URL` (or `QODO_API_URL` in `config.json`) → `$QODO_ENVIRONMENT_NAME` | `https://qodo-platform.qodo.ai/rules/v1` |

`~/.qodo/config.json` holds only Qodo CLI **UI preferences** (theme, etc.) — the token is
**not** there. Don't commit `~/.qodo/auth.key` or any HAR capture; this repo's `.gitignore`
already excludes `*.key` and `*.har`.
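
The token precedence in the table can be sketched as follows. This is an illustration of the lookup order only, not the code in `scripts/qodo_rules.py`; the function name, the `env`/`home` parameters (added so the lookup can be exercised without touching the real home directory), and the whitespace stripping are assumptions.

```python
import os
from pathlib import Path


def resolve_token(env=None, home=None):
    """Return the bearer token, or None if neither source provides one."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    from_env = env.get("QODO_API_KEY", "").strip()
    if from_env:
        return from_env  # the env var takes precedence over the key file

    key_file = home / ".qodo" / "auth.key"
    if key_file.is_file():
        # Strip the trailing newline; treat an empty file as "no token".
        return key_file.read_text(encoding="utf-8").strip() or None
    return None


# Report only whether a token was found; never print the token itself.
print("token found" if resolve_token() else "no token")
```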

## Layout

```
qodo-manage-rules/
├── README.md                        # you are here: provenance + setup
├── SKILL.md                         # the workflow (consume + administer)
├── references/
│   ├── loading-rules.md             # structured queries, two-query strategy, severity
│   ├── managing-derived-rules.md    # modify-vs-deactivate decision tree + PR-triage playbook
│   └── api-contract.md              # endpoints, auth, request/response shapes
└── scripts/
    └── qodo_rules.py                # stdlib-only client (list/get/search/find/load/set-state/update)
```
119 changes: 119 additions & 0 deletions .agents/skills/qodo-manage-rules/SKILL.md
@@ -0,0 +1,119 @@
---
name: qodo-manage-rules
description: >-
  The single skill for an org's Qodo coding rules: both CONSUMING them and ADMINISTERING
  them. (1) LOAD the rules relevant to a coding task and apply them while writing code:
  use whenever you're about to write, edit, refactor, or review code, or start planning an
  implementation, and want to comply with the org's standards up front ("what rules apply
  here", "load our coding rules", "check our conventions before I build this"). Skip if
  rules are already loaded this session. (2) MANAGE the rule catalog (list, search,
  inspect, modify (severity, content, scope), and deactivate/reactivate rules) via the Qodo
  rules API: use whenever a Qodo automated review flags something that is actually correct
  or a deliberate decision and you want to fix the RULE not the code ("this Qodo rule is
  wrong / over-broad / stale", "Qodo keeps flagging X", "the rule contradicts our
  convention", "loosen / narrow / scope / carve-out that rule", "disable / deactivate /
  turn off this rule", "change the rule from error to warning", "which rule produced this
  review comment"), or to triage a declined PR review rule-by-rule. Trigger even when
  "Qodo" isn't named, if the user is loading coding standards or reacting to an automated
  review by changing the governing rule. This skill REPLACES the deprecated qodo-get-rules
  skill (which had a broken auth lookup). NOT for posting PR comments.
---

# Qodo Rules

Qodo coding rules are **org-wide**: every teammate's PR is graded against them. This skill
is the one place to work with them, in two modes:

- **Consume** (`load`) — pull the rules relevant to what you're about to build and apply
them while coding, so you comply up front. *(This is the job the old `qodo-get-rules`
skill did; it's folded in here and fixed — see Auth.)*
- **Administer** (`list` / `find` / `get` / `update` / `set-state`) — change the rules
themselves when one is wrong, over-broad, or stale. Higher stakes: an edit changes what
everyone's review flags, so writes are dry-run by default.

## The one tool

All API plumbing is in `scripts/qodo_rules.py` (stdlib only, no deps). Use it rather than
hand-rolling curl — the write path is a **full-document PUT**, so the script does the
read-modify-write for you (a hand-built partial body would blank out other fields).

```sh
S=.claude/skills/qodo-manage-rules/scripts/qodo_rules.py

# CONSUME — load rules for the current task (apply while coding)
python3 $S load --scope /skillrig/cli/ \
--query $'Name: <topic>\nCategory: <Cat>\nContent: <what to check>' \
--query $'Name: <cross-cutting>\nCategory: <Cat>\nContent: <what to check>'

# ADMINISTER — read
python3 $S list --all
python3 $S get 782313
python3 $S search "verification offline" --scope /skillrig/cli/
python3 $S find "go:build integration" # resolve ruleId(s) from review text → enriched

# ADMINISTER — write (default DRY-RUN; add --apply to send the PUT)
python3 $S set-state 782685 inactive # deactivate (reversible — preferred over delete)
python3 $S set-state 782685 inactive --apply
python3 $S update 782313 --severity warning
python3 $S update 782313 --append-content "- Exception: the fetch layer is the feature."
python3 $S update 782313 --content-file /tmp/new_content.md --apply
```

Add `--json` for complete machine-readable output; `--limit N` bounds human rows.
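
Why a full-document PUT needs read-modify-write can be shown with an in-memory stand-in for the API. The sketch below makes no network calls and is not the script's implementation; `put_rule`, `update_rule`, and the sample rule fields are hypothetical.

```python
def put_rule(store, rule_id, body):
    """Stand-in for a full-document PUT: the stored rule becomes exactly `body`."""
    store[rule_id] = dict(body)


def update_rule(store, rule_id, changes, apply=False):
    """Read-modify-write: merge `changes` into the current document, then PUT it all."""
    before = dict(store[rule_id])
    after = {**before, **changes}
    if apply:
        put_rule(store, rule_id, after)
    return before, after  # callers can print this pair as the dry-run preview


store = {782313: {"severity": "error", "state": "active", "content": "Verify offline."}}

# Dry run: nothing is written, but the before/after pair is available for review.
before, after = update_rule(store, 782313, {"severity": "warning"})
print(before["severity"], "->", after["severity"], "| stored:", store[782313]["severity"])

# Applied: only severity changes; content and state survive the PUT.
update_rule(store, 782313, {"severity": "warning"}, apply=True)
print(store[782313])

# Contrast: a hand-built partial body replaces the whole document.
naive = {782313: {"severity": "error", "state": "active", "content": "Verify offline."}}
put_rule(naive, 782313, {"severity": "warning"})
print(naive[782313])
```

The last call shows the failure mode described above: `content` and `state` are gone because the PUT body did not include them.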

## Auth (differs from the old qodo-get-rules)

The token is a raw `sk-...` bearer from **`~/.qodo/auth.key`** (or `$QODO_API_KEY`). It is
**not** in `~/.qodo/config.json` — that file is only UI prefs. The old `qodo-get-rules`
skill looked for `config.json:API_KEY`, which doesn't exist, so it was effectively broken in
this environment; that's the main reason this skill replaces it. The same bearer token does
both reads and writes (verified). The script never prints the token.

## Mode 1 — load rules for a coding task

Run `load` with two structured queries (a topic query + a cross-cutting query) right before
you write or plan code. It prints the relevant active rules grouped by severity; apply
ERROR (must), WARNING (should), RECOMMENDATION (consider). **Skip if rules are already
loaded this session** ("📋 Qodo Rules Loaded" in recent context). Empty result is valid —
proceed without constraints; never crash on no token / no network.
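
The severity grouping can be pictured with a small sketch. The rule field names (`name`, `severity`, `state`) and the sample rules are assumptions for illustration; the script's real output format may differ.

```python
# Severity levels mapped to how strictly to apply them, strictest first.
OBLIGATION = {"error": "must", "warning": "should", "recommendation": "consider"}


def group_by_severity(rules):
    """Bucket active rules by severity; empty input yields empty buckets."""
    groups = {level: [] for level in OBLIGATION}
    for rule in rules:
        level = str(rule.get("severity", "")).lower()
        if rule.get("state", "active") == "active" and level in groups:
            groups[level].append(rule["name"])
    return groups


# Hypothetical rules as they might come back from a load.
rules = [
    {"name": "No network in unit tests", "severity": "ERROR", "state": "active"},
    {"name": "Prefer table-driven tests", "severity": "recommendation", "state": "active"},
    {"name": "Old logging rule", "severity": "warning", "state": "inactive"},
]
for level, names in group_by_severity(rules).items():
    print(f"{level.upper()} ({OBLIGATION[level]}): {names}")
```

An empty rule list produces three empty buckets rather than an error, matching the "empty result is valid" guidance.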

Full query-writing guidance (the Name/Category/Content format, category selection, the
two-query strategy, scope, and the severity-application table) is in
**`references/loading-rules.md`** — read it before composing queries.
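
For orientation, the Name/Category/Content shape used in the `--query` arguments can be assembled like this. The helper name, the single-line restriction, and the sample category are assumptions; `references/loading-rules.md` remains the authority on valid categories and wording.

```python
def build_query(name, category, content):
    """Join the three labeled fields into one newline-separated structured query."""
    fields = {"Name": name, "Category": category, "Content": content}
    for label, value in fields.items():
        # Assumed restriction: each field is one non-empty line.
        if not value or "\n" in value:
            raise ValueError(f"{label} must be a non-empty single line")
    return "\n".join(f"{label}: {value}" for label, value in fields.items())


# Hypothetical topic query; the category value is a placeholder.
print(build_query("Offline verification", "Reliability",
                  "Check that verification works without network access"))
```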

## Mode 2 — manage a rule you disagree with

When a Qodo finding is actually a correct/deliberate decision, fix the rule, not the code:

1. **Find it.** `python3 $S find "<phrase from the review comment>"` → ruleId, severity,
   state, and the `source` file it was derived from.
2. **Read it.** `python3 $S get <ruleId>` to confirm the content matches what was enforced.
3. **Decide** (see `references/managing-derived-rules.md` for the decision tree):
   - **Modify** when the intent is right but too broad/strict → narrow `content` (carve-out)
     or downgrade `severity`.
   - **Deactivate** (`set-state inactive`) when the rule shouldn't apply here at all (e.g.
     derived from a generic vendored skill and conflicting with the project's own
     convention). Reversible, so preferred over deletion; this skill exposes no hard delete.
4. **Preview, then apply.** Writes default to a dry-run printing before/after. Confirm, then
   re-run with `--apply`.
5. **Fix the source.** Qodo re-derives rules from files (`CLAUDE.md`, vendored `SKILL.md`s,
   design docs). Update the `source` file too, or the rule comes back wrong next scan.

## Safety

- **Default to dry-run.** Pass `--apply` only after the diff is shown and clearly correct —
these edits are org-wide.
- **Prefer deactivate over delete.** No undo for a hard delete, so this skill doesn't offer
one; deactivation is a clean round-trip.
- **Confirm gate-weakening writes with the user** — severity downgrades on `error` rules and
deactivations weaken the org's gates; call that out.
- **Never paste the token**, commit it, or write it to a tracked file.
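
The "confirm gate-weakening writes" rule can be expressed as a small predicate over a rule's before/after state. This is a sketch of the policy as stated above, assuming `state` and `severity` fields with the values shown; the function name is hypothetical and this is not the script's own check.

```python
SEVERITY_RANK = {"recommendation": 0, "warning": 1, "error": 2}


def weakens_gate(before, after):
    """True when a write deactivates a rule or downgrades an `error` rule's severity."""
    deactivated = before.get("state") == "active" and after.get("state") == "inactive"
    downgraded = (
        before.get("severity") == "error"
        # Unknown or missing severities default to the top rank: not a downgrade.
        and SEVERITY_RANK.get(after.get("severity"), 2) < SEVERITY_RANK["error"]
    )
    return deactivated or downgraded


rule = {"state": "active", "severity": "error"}
print(weakens_gate(rule, {"state": "active", "severity": "warning"}))
print(weakens_gate(rule, {"state": "active", "severity": "error"}))
```

A caller would ask the user for explicit confirmation whenever the predicate is true, before re-running with `--apply`.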

## References

- `references/loading-rules.md` — Mode 1: structured query format, two-query strategy,
scope, and severity application.
- `references/managing-derived-rules.md` — Mode 2: modify-vs-deactivate decision tree, the
"rules are derived from source files" model, and the worked PR-triage playbook.
- `references/api-contract.md` — endpoints, auth resolution, request/response shapes, and
the list-vs-search schema gotcha (`ruleId` vs `id`).