feat(query): --save-baseline / --baseline (B.6) — snapshots in .codemap.db#30

Merged: SutuSebastian merged 3 commits into main from feat/query-baselines on May 1, 2026

Conversation

@SutuSebastian SutuSebastian commented May 1, 2026

Summary

Implements B.6 from docs/research/fallow.md § Tier B: snapshot a query result set, refactor, then diff. Four new flags on codemap query:

| Flag | What it does |
| --- | --- |
| `--save-baseline[=<name>]` | Snapshot the rows. Name defaults to `--recipe` id; ad-hoc SQL needs `=<name>`. Re-saving overwrites in place. |
| `--baseline[=<name>]` | Diff current rows vs the saved snapshot. Output: `{baseline:{...}, current_row_count, added: [...], removed: [...]}`. |
| `--baselines` | List saved baselines (no `rows_json` payload). |
| `--drop-baseline <name>` | Delete one. |
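
The naming rule for `--save-baseline` / `--baseline` (bare flag falls back to the recipe id, ad-hoc SQL must pass `=<name>`) can be sketched as below. This is a minimal illustration, not the PR's parser: `resolveBaselineName` and the error wording are hypothetical, and only the `true | string` flag shape is taken from the review walkthrough.

```typescript
// Hypothetical helper: illustrates the naming rule only, not the real parser.
// `true` models a bare flag (--baseline); a string models --baseline=<name>.
type BaselineFlag = true | string;

function resolveBaselineName(flag: BaselineFlag, recipeId?: string): string {
  if (typeof flag === "string") {
    if (flag.length === 0) {
      throw new Error("baseline name must not be empty");
    }
    return flag; // an explicit name always wins
  }
  if (recipeId !== undefined) {
    return recipeId; // bare flag: default to the recipe id
  }
  // Ad-hoc SQL has no recipe id to fall back on.
  throw new Error("ad-hoc SQL needs an explicit baseline name (=<name>)");
}

// Example calls
const fromRecipe = resolveBaselineName(true, "fan-out");
const explicit = resolveBaselineName("before-refactor", "fan-out");
```

Resolving the name in one place keeps the "ad-hoc SQL needs a name" failure at parse time rather than at the database layer.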

Storage decision: DB, not files

Snapshots live in a new query_baselines table inside .codemap.db rather than .codemap/baselines/<recipe>.json. Driven by the brainstorm in this conversation and the SQL-index thesis:

| Axis | DB table (this PR) | JSON files (rejected) |
| --- | --- | --- |
| Thesis fit | One file, one query surface | Parallel artifact, dilutes the pitch |
| Gitignore | `.codemap.*` already covers it | Need new entry / new dir |
| Cross-baseline queries | SQL JOIN | N file reads + glue |
| Atomicity | Single SQLite transaction | fs temp-file dance |
| Format versioning | Schema bump (already a primitive) | Hand-rolled version field per file |
| Discoverability | `SELECT * FROM query_baselines` works on day one | New CLI subcommand to enumerate |

SCHEMA_VERSION 4 → 5. The new table is intentionally absent from dropAll() so --full and future schema rebuilds preserve baselines (only index tables get dropped).
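
To make the survival contract concrete, here is a small sketch that models the database as an in-memory map. It is not the real `dropAll()` from `src/db.ts`: the `FakeDb` type, the `files` table name, and the sample rows are assumptions for illustration. Only `symbols` and `query_baselines` are table names that appear in this PR.

```typescript
// Stand-in for SQLite: table name -> rows. Purely illustrative.
type FakeDb = Map<string, unknown[]>;

// Assumed list of rebuildable index tables. query_baselines is deliberately
// NOT listed, which is what lets it survive a rebuild.
const INDEX_TABLES = ["files", "symbols"] as const;

function dropAllSketch(db: FakeDb): void {
  for (const table of INDEX_TABLES) {
    db.delete(table);
  }
}

const fakeDb = new Map<string, unknown[]>([
  ["files", [{ path: "src/db.ts" }]],
  ["symbols", [{ name: "dropAll" }]],
  ["query_baselines", [{ name: "fan-out", row_count: 10 }]],
]);

dropAllSketch(fakeDb);
const survivors = [...fakeDb.keys()]; // only the baseline table is left
```

Because the drop list is an allow-list of index tables, any new user-data table is preserved by default. The flip side, as noted in the commit message, is that changes to `query_baselines` itself need an in-place migration.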

Composition

| With | Behaviour |
| --- | --- |
| `--summary` | Collapses diff to `{baseline:{...}, current_row_count, added: N, removed: N}` |
| `--changed-since <ref>` | Pre-filters current rows before the diff (PR-scoped delta against the saved snapshot) |
| `--recipe` + recipe `actions` | Actions attach to `added` rows only (the rows the agent should act on) |
| `--group-by` | Mutually exclusive (different output shape) |
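
The `--summary` collapse can be sketched as a pure function over the full diff payload. The field names follow the payload shape documented in this PR; the TypeScript types (for example `created_at` as a number) and the `summarizeDiff` name are assumptions, not the actual code in `cmd-query.ts`.

```typescript
type Row = Record<string, unknown>;

// Field names come from the documented payload; the types are assumed.
interface BaselineMeta {
  name: string;
  recipe_id: string | null;
  row_count: number;
  git_ref: string | null;
  created_at: number;
}

interface BaselineDiff {
  baseline: BaselineMeta;
  current_row_count: number;
  added: Row[];
  removed: Row[];
}

interface BaselineDiffSummary {
  baseline: BaselineMeta;
  current_row_count: number;
  added: number;
  removed: number;
}

// Hypothetical helper: keeps the metadata, replaces row arrays with counts.
function summarizeDiff(diff: BaselineDiff): BaselineDiffSummary {
  return { ...diff, added: diff.added.length, removed: diff.removed.length };
}

// Mirrors the contrived smoke test: 5 rows saved, 7 rows now.
const sampleDiff: BaselineDiff = {
  baseline: { name: "fan-out", recipe_id: "fan-out", row_count: 5, git_ref: null, created_at: 0 },
  current_row_count: 7,
  added: [{ id: 6 }, { id: 7 }],
  removed: [],
};
const sampleSummary = summarizeDiff(sampleDiff); // added: 2, removed: 0
```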

Diff identity

Per-row JSON.stringify equality. No fuzzy "changed" category in v1 (avoids the row-key heuristic; agents can re-derive richer diffs from the raw rows).

Test plan

  • bun run check passes (build, format:check, lint:ci, test, typecheck, test:golden — all 19 golden scenarios green; 4 new parser tests + 1 new db round-trip test).
  • End-to-end smoked against this clone:
    • Save default-name (--save-baseline -r fan-out) → {"saved":"fan-out","row_count":10,…}
    • List → shows the saved baseline with metadata
    • Diff against unchanged tree → {added:[],removed:[]}
    • Contrived diff (SELECT … LIMIT 5 saved → LIMIT 7 baseline'd) → 2 added rows
    • --summary diff → {added:2,removed:0}
    • Drop → list shows only the remaining baseline
    • bun:sqlite null vs better-sqlite3 undefined coercion handled (caught in the db test).
  • Schema bump documented in architecture.md § query_baselines + glossary entry + Schema Versioning.
  • Per Rule 10, rule + skill updated in lockstep across .agents/ and templates/agents/.
  • Minor changeset (schema bump per .agents/lessons.md).
  • CI green.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added query baseline management: save result snapshots with --save-baseline, compare against saved baselines with --baseline to display added/removed rows, list stored baselines with --baselines, and delete baselines with --drop-baseline.
    • Baselines persist across --full runs and schema changes.
  • Tests

    • Added comprehensive test coverage for baseline parsing, storage, and diff operations.
  • Documentation

    • Updated CLI reference, architecture guide, and skill documentation with baseline workflows and command examples.

…ap.db

Adds the four-flag baseline surface from docs/research/fallow.md B.6:

- --save-baseline[=<name>]   snapshot result rows (name = recipe id by default)
- --baseline[=<name>]        diff current result vs saved snapshot
- --baselines                list saved baselines (no rows_json payload)
- --drop-baseline <name>     delete one

Storage decision: snapshots live in a new `query_baselines` table inside
.codemap.db rather than parallel JSON files. Wins over the file-per-baseline
sketch:

- One on-disk artifact, no new gitignore entries
- Atomic writes (single SQLite txn)
- Cross-baseline queries are SQL JOINs
- No file format design / hand-rolled version field

Schema 4 → 5. The new table is intentionally absent from dropAll() so
baselines survive `--full` and future SCHEMA_VERSION rebuilds (only
index tables get dropped). Future schema changes to query_baselines
itself need an in-place migration.

Diff identity for v1 = canonical JSON.stringify(row). Output:
  {baseline:{name, recipe_id, row_count, git_ref, created_at},
   current_row_count,
   added: [...rows],
   removed: [...rows]}

Composes with everything: --summary collapses to {added:N, removed:N};
--changed-since filters before the diff; --baseline + --recipe attach
recipe `actions` to the `added` rows only (the rows the agent should
act on); --group-by is mutually exclusive with --baseline (different
output shape).

Tests cover parser shape for all four flags, db round-trip with
upsert / get / list / delete + the bun-vs-better-sqlite3 null/undefined
coercion. End-to-end smoked: save / list / diff (no change) / contrived
diff (5→7 rows) / --summary diff / drop, all under both --recipe and
ad-hoc-with-explicit-name modes.

Per Rule 10: rule + skill updated in lockstep across .agents/ and
templates/agents/.

Schema bump justifies a minor changeset per .agents/lessons.md
"changesets bump policy (pre-v1)".

changeset-bot Bot commented May 1, 2026

🦋 Changeset detected

Latest commit: ff6986e

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
| --- | --- |
| @stainless-code/codemap | Minor |


coderabbitai Bot commented May 1, 2026

📥 Commits

Reviewing files that changed from the base of the PR and between 64e3e98 and ff6986e.

📒 Files selected for processing (8)
  • .agents/lessons.md
  • .agents/skills/codemap/SKILL.md
  • README.md
  • docs/architecture.md
  • src/cli/cmd-query.test.ts
  • src/cli/cmd-query.ts
  • src/db.test.ts
  • templates/agents/skills/codemap/SKILL.md
Walkthrough

Introduces query baseline functionality to Codemap's query CLI command, enabling users to snapshot current query results to a persisted .codemap.db table, compare subsequent runs against saved baselines via JSON-stringified row identity, and manage baselines via list/delete operations. Schema version incremented to 5.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Documentation Updates**: `.agents/rules/codemap.md`, `.agents/skills/codemap/SKILL.md`, `docs/architecture.md`, `docs/glossary.md`, `templates/agents/rules/codemap.md`, `templates/agents/skills/codemap/SKILL.md` | Extended CLI reference with baseline workflow (`--save-baseline`, `--baseline`, `--baselines`, `--drop-baseline`) and diffing mechanics (per-row JSON equality, added/removed categorization). Updated recipe actions behavior to attach only to added rows under `--baseline`. Schema version bump to 5 and new `query_baselines` table documentation. |
| **Changelog & README**: `.changeset/query-baselines.md`, `README.md` | Standard changeset entry documenting minor version bump with baseline feature and CLI example additions to project README. |
| **Database Layer**: `src/db.ts`, `src/db.test.ts` | Schema increment to v5; new `query_baselines` table with metadata (`name`, `recipe_id`, `sql`, `row_count`, `git_ref`, `created_at`) and canonical `rows_json` snapshot. New CRUD exports: `upsertQueryBaseline`, `getQueryBaseline`, `listQueryBaselines`, `deleteQueryBaseline`. Comprehensive lifecycle tests. |
| **CLI Implementation**: `src/cli/cmd-query.ts`, `src/cli/cmd-query.test.ts`, `src/cli/main.ts` | Parser augmented to recognize baseline flags with validation (mutual exclusivity, mandatory naming for ad-hoc SQL). New command kinds (`listBaselines`, `dropBaseline`) and run-object fields (`saveBaseline`, `baseline`). `runQueryCmd` branches to baseline operations (snapshot persistence, diffing with set-membership identity, summary/full output modes). New exported handlers `runListBaselinesCmd` and `runDropBaselineCmd`. Extensive test coverage for parsing and error scenarios. |

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant CLI Parser
    participant runQueryCmd
    participant Database
    
    rect rgba(100, 150, 200, 0.5)
    Note over User,Database: Save Baseline Workflow
    User->>CLI Parser: codemap query --save-baseline[=name] -r recipe
    CLI Parser->>runQueryCmd: { kind: 'run', saveBaseline: true|string, ... }
    runQueryCmd->>Database: Execute query for recipe
    Database-->>runQueryCmd: Current result rows
    runQueryCmd->>Database: upsertQueryBaseline(name, sql, rows_json, ...)
    Database-->>runQueryCmd: Baseline stored
    runQueryCmd-->>User: Snapshot confirmed
    end
    
    rect rgba(150, 100, 200, 0.5)
    Note over User,Database: Baseline Diff Workflow
    User->>CLI Parser: codemap query --baseline[=name] -r recipe
    CLI Parser->>runQueryCmd: { kind: 'run', baseline: true|string, ... }
    runQueryCmd->>Database: Execute query for recipe
    Database-->>runQueryCmd: Current result rows
    runQueryCmd->>Database: getQueryBaseline(name)
    Database-->>runQueryCmd: Saved baseline snapshot
    runQueryCmd->>runQueryCmd: Compute diff (JSON.stringify set membership)
    runQueryCmd-->>User: { added: [...], removed: [...] }
    end
    
    rect rgba(200, 150, 100, 0.5)
    Note over User,Database: Baseline Management
    User->>CLI Parser: codemap query --baselines
    CLI Parser->>runQueryCmd: { kind: 'listBaselines', ... }
    runQueryCmd->>Database: listQueryBaselines()
    Database-->>runQueryCmd: Metadata list
    runQueryCmd-->>User: Baselines (name, recipe_id, row_count, created_at)
    
    User->>CLI Parser: codemap query --drop-baseline name
    CLI Parser->>runQueryCmd: { kind: 'dropBaseline', name, ... }
    runQueryCmd->>Database: deleteQueryBaseline(name)
    Database-->>runQueryCmd: boolean (success)
    runQueryCmd-->>User: Deletion confirmed
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Suggested labels

enhancement, documentation

Poem

🐰 A baseline is born in .db's deep store,
Snapshots of queries from lore to lore,
Added and removed in diffing delight,
JSON strings compare both day and night,
Persist through the chaos, survive schema's might! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main feature addition: baseline snapshot and diff functionality for the query command with storage in .codemap.db. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Two lessons appended after auditing this PR against .agents/rules/:

- Backticks inside SQL/help-text template literals — hit twice now
  (B.7 schema comment + B.6 help text). The cmd-query.ts help string
  and db.ts CREATE TABLE strings are both `template literals`; a
  Markdown-style `--flag` code-fence inside terminates the literal
  early and TypeScript explodes several lines later with a cryptic
  "expected `,` or `)`". Lesson: use plain prose in those strings,
  or escape with \\\`.

- STOP-before-Grep applies to symbol lookups too — used Grep for
  `printQueryResult`, `getCurrentCommit`, `dropAll` in PR #30 when
  `SELECT … FROM symbols WHERE name = ?` was the right tool. The
  codemap rule already covers this; lesson clarifies that "symbol
  lookup" is the trigger, not "structural question."

Also slim two non-earning code comments per concise-comments rule.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cli/cmd-query.ts (1)

284-343: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make baseline mode mutually exclusive with --group-by.

runQueryCmd() handles saveBaseline/baseline before grouped execution, so codemap query --group-by owner --baseline -r fan-out currently returns an ungrouped baseline result and silently drops --group-by.

Suggested guard
   if (saveBaseline !== undefined && baseline !== undefined) {
     return {
       kind: "error",
       message:
         "codemap: --save-baseline and --baseline are mutually exclusive in one run.",
     };
   }
+  if (groupBy !== undefined && (saveBaseline !== undefined || baseline !== undefined)) {
+    return {
+      kind: "error",
+      message:
+        "codemap: --group-by cannot be combined with --save-baseline or --baseline.",
+    };
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/cmd-query.ts` around lines 284 - 343, Add a guard so baseline mode
cannot be used with grouped execution: in the same parsing block (the function
handling CLI args, near the existing saveBaseline/baseline checks) check if
groupBy !== undefined and (saveBaseline !== undefined || baseline !== undefined)
and return an error result (kind: "error") with a clear message like "codemap:
--group-by cannot be used with --save-baseline or --baseline." Reference the
existing variables saveBaseline, baseline, groupBy and the run branch that
returns { kind: "run", ... } so the new check runs before that branch is
returned.
🧹 Nitpick comments (1)
src/db.test.ts (1)

125-188: ⚡ Quick win

Add one assertion for the rebuild-survival path.

This test covers CRUD well, but the headline contract of the feature is that query_baselines survives --full / schema rebuilds because dropAll() leaves it behind. Without exercising that path once, a future schema refactor can break the marquee behavior while this suite still passes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/db.test.ts` around lines 125 - 188, Add a check that query_baselines
survive a full schema rebuild by invoking dropAll(db) after the initial upserts
and then asserting the baselines still exist via listQueryBaselines(db) and/or
getQueryBaseline(db, "fan-out"); specifically call dropAll(db) (the rebuild path
referenced in the comment) and then
expect(listQueryBaselines(db).map(b=>b.name)).toContain("fan-out") and/or
expect(getQueryBaseline(db, "fan-out")).toBeDefined() before continuing
deletions and closeDb.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/architecture.md`:
- Line 120: Update the docs to reflect that `--baseline --summary` still emits
`baseline` metadata and `current_row_count` in addition to added/removed counts;
edit the paragraph describing `--save-baseline`, `--baseline[=<name>]` and the
`--summary` behavior in the architecture doc (the block referencing
`--baseline[=<name>]` and the claimed output `{added: N, removed: N}`) so it
shows the actual payload shape `{baseline:{...}, current_row_count, added: N,
removed: N}` when `--summary` is used with `--baseline`; keep references to
`runQueryCmd`/`--summary`/`--baseline` to locate the text to change.

In `@README.md`:
- Around line 88-93: The README baseline section is missing three user-visible
behaviors: document that the full JSON diff output (when using --json
--baseline) includes a current_row_count field; state that running ad-hoc SQL in
baseline mode requires specifying a baseline name (e.g., when using --baseline
you must provide the saved name from --save-baseline); and explicitly note that
--group-by cannot be combined with --baseline. Update the examples and prose
around the commands shown (references: codemap query --save-baseline, codemap
query --json --baseline, codemap query --group-by, codemap query --baselines) to
mention these constraints and show a short example of the JSON diff including
current_row_count and an example of specifying a baseline name.

In `@src/cli/cmd-query.ts`:
- Around line 249-281: The listBaselines and dropBaselineName branches (the code
paths that return { kind: "listBaselines", json } and { kind: "dropBaseline",
... }) currently ignore flags like summary, changedSince, and groupBy; update
the guard conditions inside the listBaselines (checking listBaselines) and
dropBaselineName (checking dropBaselineName) branches to also reject if summary
!== undefined || changedSince !== undefined || groupBy !== undefined (in
addition to the existing checks against recipeId, printSqlId, saveBaseline,
baseline, i < rest.length, etc.), and return the same kind:"error" pattern with
an appropriate message so commands like "codemap query --summary --baselines"
fail instead of silently ignoring those flags.
- Around line 840-854: The diffRows implementation collapses duplicates by using
Set(JSON.stringify(row)), causing incorrect diffs for multisets; update diffRows
to perform multiset diffing by using frequency maps keyed by JSON.stringify
within the function (e.g., build baseCounts and curCounts maps), decrement
counts when matching, then reconstruct added as entries in current whose count
in baseCounts is exhausted and removed as entries in baseline whose count in
curCounts is exhausted; retain the same return shape ({ added, removed }) and
reference the diffRows function name so the change is localized.

In `@templates/agents/skills/codemap/SKILL.md`:
- Around line 222-230: Update the schema table entry for row_count in SKILL.md
so it correctly describes that row_count is the cached number of saved rows
(i.e., the count of entries represented by rows_json), not the character length
of rows_json; locate the table in templates/agents/skills/codemap/SKILL.md (the
row with "row_count | INTEGER") and change its Description to something like
"Cached number of saved rows" and mirror the exact same wording in
.agents/skills/codemap/SKILL.md.
- Around line 41-47: Update the SKILL.md flags list to explicitly state that
baseline mode cannot be combined with --group-by: find the section describing
"--baseline[=<name>]" and "--group-by owner|directory|package" and add a short
note such as "Note: --baseline (and --save-baseline) cannot be used together
with --group-by; these flags are mutually exclusive" so agents/clients won't
synthesize the invalid combined command; ensure the note appears adjacent to
both flag descriptions so readers of either entry see the restriction.

📥 Commits

Reviewing files that changed from the base of the PR and between 09c6370 and 64e3e98.

📒 Files selected for processing (13)
  • .agents/rules/codemap.md
  • .agents/skills/codemap/SKILL.md
  • .changeset/query-baselines.md
  • README.md
  • docs/architecture.md
  • docs/glossary.md
  • src/cli/cmd-query.test.ts
  • src/cli/cmd-query.ts
  • src/cli/main.ts
  • src/db.test.ts
  • src/db.ts
  • templates/agents/rules/codemap.md
  • templates/agents/skills/codemap/SKILL.md

…guards, doc payloads)

Six actionable + one nitpick (one Major), all verified correct.

Code:
- diffRows: switch from naive Set to multiset frequency-map diff. Naive
  Set([A,A]) vs Set([A]) reported no removal — wrong for non-DISTINCT
  queries (e.g. `SELECT name FROM symbols`). Now baseline [A,A] vs
  current [A] correctly reports removed: [A].
- Parser: --group-by + --save-baseline / --baseline now errors at parse
  time. Previously runQueryCmd routed to the baseline branch first and
  silently dropped --group-by.
- Parser: --baselines and --drop-baseline now reject --summary,
  --changed-since, and --group-by (in addition to the existing recipe
  / SQL / save / baseline checks). Was silently accepted-and-ignored.

Docs (synced across architecture.md, README.md, AND both copies of
SKILL.md per Rule 10):
- --baseline --summary payload corrected: includes baseline + current_row_count
  alongside added/removed counts (was documented as just {added: N, removed: N}).
- README baseline section calls out current_row_count, ad-hoc-needs-name,
  --group-by mutex.
- SKILL.md row_count description: "Cached length of rows_json" was
  ambiguous (could mean character length); now "Cached number of rows
  in the saved result set."
- SKILL.md --group-by description: "Mutually exclusive with
  --save-baseline / --baseline." Mirrored on the --baseline side too.
- rows_json description: "multiset diff identity (duplicate rows
  preserved)" instead of "set-diff identity = per-row JSON-stringify
  equality."

Tests:
- New diffRows multiset suite (6 cases including 3-of-3 duplicates and
  per-key independence).
- New parser tests: --group-by + baseline mutex, --baselines / --drop-baseline
  no-op-flag rejection.
- New db round-trip test: query_baselines survives dropAll() — the
  schema-rebuild contract that's the marquee of B.6.

Export diffRows so it can be unit-tested in isolation; runtime callers
already use it through the same module.
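
The multiset diff described in this commit can be sketched with two frequency maps keyed by `JSON.stringify(row)`. This is a minimal illustration under stated assumptions, not the exported `diffRows` itself: the name `multisetDiffRows` is hypothetical, and the sketch assumes rows are serialized with a consistent key order, since `JSON.stringify` is sensitive to property order.

```typescript
type Row = Record<string, unknown>;

// Count how many times each serialized row occurs.
function countRows(rows: readonly Row[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const row of rows) {
    const key = JSON.stringify(row);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Hypothetical sketch of a multiset diff: duplicates are matched one-for-one,
// so baseline [A, A] vs current [A] reports one removed A.
function multisetDiffRows(
  baseline: readonly Row[],
  current: readonly Row[],
): { added: Row[]; removed: Row[] } {
  const baseCounts = countRows(baseline);
  const curCounts = countRows(current);

  const added: Row[] = [];
  for (const row of current) {
    const key = JSON.stringify(row);
    const left = baseCounts.get(key) ?? 0;
    if (left > 0) baseCounts.set(key, left - 1); // matched a baseline copy
    else added.push(row); // baseline copies exhausted
  }

  const removed: Row[] = [];
  for (const row of baseline) {
    const key = JSON.stringify(row);
    const left = curCounts.get(key) ?? 0;
    if (left > 0) curCounts.set(key, left - 1); // matched a current copy
    else removed.push(row); // current copies exhausted
  }

  return { added, removed };
}

const a: Row = { name: "A" };
const dupResult = multisetDiffRows([a, a], [a]); // removed: [A], added: []
```

A plain `Set` of serialized rows would treat `[A, A]` and `[A]` as equal, which is the bug this commit fixes for non-DISTINCT queries.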
@SutuSebastian
Contributor Author

Two CodeRabbit "outside diff range" findings also addressed in ff6986e:

  • src/cli/cmd-query.ts L284-343 (Major) — --group-by + --baseline mutex. Real bug: runQueryCmd checked saveBaseline / baseline first and routed to that branch, silently dropping --group-by. Now guarded at parse time:

    $ codemap query --json --group-by directory --baseline -r fan-out
    codemap: --group-by cannot be combined with --save-baseline or --baseline (different output shapes).
    
  • src/db.test.ts L125-188 (nitpick) — exercise the dropAll() survival path. Headline contract of B.6 is that baselines survive --full and SCHEMA rebuilds. Added a dedicated test that calls dropAll(db); createTables(db); and asserts the baseline still exists, so a future schema refactor can't silently break it.

@SutuSebastian SutuSebastian merged commit a309d52 into main May 1, 2026
9 checks passed
@SutuSebastian SutuSebastian deleted the feat/query-baselines branch May 1, 2026 07:47
SutuSebastian added a commit that referenced this pull request May 1, 2026
…chitecture skills (#32)

Two unrelated docs changes batched:

## 1. Plan: `codemap audit --base <ref>` (B.5)

Per `docs/README.md` Rule 3 (plans live in `plans/<feature-name>.md`, link from `roadmap.md`), drafts the design for **B.5** before writing any code. The research note explicitly calls this "the single highest-leverage candidate this refresh."

| Decision | v1 |
| --- | --- |
| **Snapshot strategy** | Temp worktree + full reindex under `.codemap.audit-<sha>/` (gitignored by the existing `.codemap.*` glob). Defers caching / perf-tuning until a real consumer hits the wall. |
| **Built-in deltas** | `files`, `dependencies`, `deprecated`, `visibility`, `barrels`, `hot_files`. Each wraps an existing recipe — no new analysis layer. |
| **Verdict** | `pass` / `warn` / `fail` with thresholds **opt-in via `codemap.config.audit`**. v1 emits raw deltas only (default `pass`). |
| **Exit codes** | `0` / `1` / `2` — mirrors `git diff --exit-code`. |
| **Composition** | `--json` / `--summary` work; `--changed-since` / `--group-by` / `--save-baseline` / `--baseline` are mutex (different shapes / semantics). |
| **Tracer-bullet sequence** | 7 commits: scaffold → worktree → first delta → remaining deltas → threshold config → docs+agents (Rule 10) → changeset. |

Both prerequisites just merged on `main`: B.6 (PR #30) proves the snapshot-in-DB primitive; B.7 (PR #28) provides the `symbols.visibility` column the `visibility` delta needs.

## 2. Adopt two Tier 3 skills from [`mattpocock/skills`](https://github.com/mattpocock/skills)

Sourced after evaluating three skills mid-thread; the two adopted ones earn their always-zero-cost slot:

| Skill | What |
| --- | --- |
| **`grill-me`** | 8-line interview-pattern skill. Walk a design tree branch by branch, recommend an answer per question, ask one at a time. Filled the gap visible in commit 1's plan: I made many decisions by myself; `grill-me` would have surfaced them for second opinion before they crystallised. |
| **`improve-codebase-architecture`** | Ousterhout-style deepening vocabulary (`module / interface / seam / adapter / depth / leverage / locality`), the deletion test, "one adapter = hypothetical seam, two = real," dependency categories (`DEEPENING.md`), and parallel-sub-agent "Design It Twice" interface exploration (`INTERFACE-DESIGN.md`). |

Both are maintainer-only (under `.agents/skills/` + `.cursor/skills/` symlinks per `agents-first-convention`). **Not added to `templates/agents/`** — same precedent as PR #25 (consumer surface ships only the codemap rule + skill).

### Translation notes

`improve-codebase-architecture/SKILL.md` adapted at three points to fit codemap's docs framework (the upstream version assumes `CONTEXT.md` + `docs/adr/`; we have neither):

- `CONTEXT.md` references → `docs/glossary.md` (Rule 9 already enforces glossary updates per PR).
- `docs/adr/` references → `docs/plans/<topic>.md` (Rule 3 — but plans are mortal; decisions of record lift to `architecture.md` per Rule 2 then the plan is deleted).
- "Offer ADR on rejection" step → dropped. Codemap doesn't keep decision records; the closest is "lift to architecture.md."

Companion files (`LANGUAGE.md`, `DEEPENING.md`, `INTERFACE-DESIGN.md`) ship **verbatim** — none reference `CONTEXT.md` or ADRs.

`grill-me/SKILL.md` extended with two short codemap-specific notes: prefer `codemap` over `Grep` when exploring (per the `codemap` rule), and write crystallised answers into the in-flight `docs/plans/<name>.md` inline (Rule 3).

### Skipped

- **`grill-with-docs`** (the third skill in the upstream "grill" family) — requires standing up CONTEXT.md / `docs/adr/` infrastructure that conflicts with the lift-to-architecture-then-delete-the-plan lifecycle codemap already runs. The salvageable ADR 3-criteria gate is recorded in this conversation; lift if codemap ever needs ADRs.

### Tier 3 list updated

`.agents/rules/agents-tier-system.md` Tier 3 list extended with both new skills, and the previously-missing `docs-governance` + `docs-lifecycle-sweep` entries from PR #25.

## Test plan

- [x] `bun run check` green (no behavior changed; pure docs + skills).
- [x] All cross-references resolve (plan → research → architecture / lessons; skill files → glossary.md / architecture.md / codemap rule / each other).
- [x] `.cursor/skills/{grill-me,improve-codebase-architecture}` symlinks resolve.
- [x] Plan calls itself out as **Plan** type per `docs/README.md § Document Lifecycle` — delete on ship, lift to `architecture.md`.
- [ ] CI green.
SutuSebastian added a commit that referenced this pull request May 1, 2026
)

* docs(research): refresh fallow.md + scan against current ship state

fallow.md gains a "Status snapshot (as of 2026-05-01)" section that
tabulates every adoption candidate's ship status — single source of
truth for "what's open" without munging the original tier tables.

Captures:
- Tier A all shipped (PR #26)
- B.5 partial (v1 in PR #33; --base <ref> + verdict deferred to v1.x)
- B.6 shipped (PR #30) — table-in-DB, not parallel JSON files
- B.7 shipped (PR #28) — landed on `symbols`, not `exports`
- B.8 / C.9 / C.10 / C.11 / D.* still as-was
- MCP server (agent-transports v1) shipped in PR #35 (adjacent —
  not a numbered fallow candidate but worth surfacing here)

§ 6 open questions: marks the 2 settled ones (actions ownership,
audit verdict default) with their resolution PRs; preserves the 2
still-open ones (coverage column shape, plugin layer scope).

§ 3 already-shipped block: updates the visibility-tags note to
acknowledge B.7 promoted it from regex to structured column instead
of saying "B.7 proposes promoting" (which it doesn't anymore).

competitive-scan-2026-04.md § 4: marks MCP server wrapping `query`
as ✅ shipped via PR #35 with a cross-link to fallow.md's status
snapshot. Other items still tracked there.

No behavior change; pure docs refresh to match current reality.

* docs(research): fix MD056 — D row in fallow.md status snapshot was 4 cells, header was 5

CodeRabbit caught: status-snapshot table header has 5 columns (Tier / # / Item / Status / Where) but the D.12-D.16 row only had 4 (collapsed Status + Where into one cell). Markdown parses that as a malformed table; renderers either drop the row or misalign neighbouring rows. Added the missing 5th cell pointing back at § 1's Defer / skip table for the per-row reasoning.

* docs(research): align B.7 row title to shipped column name (symbols.visibility)

CodeRabbit caught: Tier B table B.7 row title still said 'exports.visibility column' despite the body hedging '(or symbols)' AND the shipped column landing on symbols. Status snapshot row at L22 already says symbols. Updated the title to match shipped reality + added an explicit nod to the original hedge so the historical-record property survives.