Skip to content

docs(plan): codemap audit --base <ref> — worktree + reindex strategy (v1.x backlog)#51

Merged
SutuSebastian merged 2 commits intomainfrom
docs/plan-audit-base
May 3, 2026
Merged

docs(plan): codemap audit --base <ref> — worktree + reindex strategy (v1.x backlog)#51
SutuSebastian merged 2 commits intomainfrom
docs/plan-audit-base

Conversation

@SutuSebastian
Copy link
Copy Markdown
Contributor

@SutuSebastian SutuSebastian commented May 3, 2026

Summary

Plan for codemap audit --base <ref> — the v1.x backlog item that closes the highest-frequency agent loop left open after PR #50: PR-review structural diff.

Today, comparing the working tree to `origin/main` is a 3-step `--baseline` dance (switch branches, reindex, save baselines, switch back). `--base ` collapses it to one verb that:

  1. Resolves `` via `git rev-parse --verify` (clean error on non-git or unresolvable ref)
  2. Materialises the ref via `git worktree add` to a sha-keyed cache dir under `.codemap/audit-cache/` (does NOT touch the user's working tree)
  3. Runs `runCodemapIndex` against the worktree (full mode; cache miss only)
  4. Runs the same canonical SQL the existing audit deltas already use, on both DBs
  5. Diffs row-sets via the same `diffRows` helper PR feat: codemap audit v1 (B.5) — structural-drift command with per-delta baselines #33 already ships
  6. Returns the same `{head, deltas: {key: {base, added, removed}}}` envelope, with `base.source: "ref"` (new value) + `base.ref` (new field)

Why

PR #50 just shipped `codemap impact` — the "blast radius" loop. The remaining high-frequency agent loop without a one-verb solution is "what changed structurally in this PR?" `--base ` reuses 90% of the audit infrastructure (only new piece is the worktree + reindex snapshot path) and lights up MCP + HTTP automatically via the existing `tool-handlers.ts` dispatcher.

See `docs/plans/audit-base.md` for the full design — sketched API, output envelope, 10 decisions (D1-D10), 5 tracers, perf considerations, alternatives rejected, and out-of-scope list.

Test plan

  • CodeRabbit reviews the design choices (especially D1 — worktree vs in-place checkout, D2 — sha-keyed cache + LRU eviction, D4 — non-git hard error, D7 — per-delta override composition)
  • Once approved + merged, follow-up PR implements Tracers 1-5

Summary by CodeRabbit

Release Notes

  • Documentation
    • Added design documentation outlining the audit --base feature specification
    • Updated project roadmap with reference to new design plan

…(v1.x backlog)

Plan for the next-best agent-value loop: PR-review structural-diff. Replaces today's 3-step --baseline dance with one verb. Reuses 90% of the existing audit infrastructure (PR #33); only new piece is the worktree+reindex snapshot path.

Cache-by-resolved-sha; LRU 5/500 MiB; mutual-exclusive with --baseline; per-delta override compatible. Hard error on non-git projects (no graceful fallback — there's no meaningful 'ref' without git).

Plan only — implementation follows after CodeRabbit review per the impact (#49#50) / watch (#46#47) workflow.
@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented May 3, 2026

⚠️ No Changeset found

Latest commit: 9c92b05

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 3, 2026

Warning

Rate limit exceeded

@SutuSebastian has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 36 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c95fc21a-d005-465f-9e76-cf9fdf351bd8

📥 Commits

Reviewing files that changed from the base of the PR and between 61eec22 and 9c92b05.

📒 Files selected for processing (1)
  • docs/plans/audit-base.md
📝 Walkthrough

Walkthrough

Added a comprehensive design plan (docs/plans/audit-base.md) for the codemap audit --base <ref> feature, specifying CLI interface, output shape, git resolution strategy, cache behavior, error handling, acceptance criteria, and alternatives. Updated roadmap to reference the new plan.

Changes

Documentation: audit --base Feature Plan

Layer / File(s) Summary
Feature Design Plan
docs/plans/audit-base.md
New design document defining the --base <ref> CLI flag, output envelope structure with base.source="ref" and base.ref metadata, worktree + reindex strategy, sha-keyed caching, ref resolution, error cases, mutual exclusivity with --baseline, and acceptance tracers.
Roadmap Update
docs/roadmap.md
Backlog item for codemap audit --base <ref> (v1.x) rewritten to remove parenthetical deferral note and link directly to plans/audit-base.md.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Poem

A rabbit sketches audits fine,
With base refs in a grand design,
From worktrees spawned and caches blessed,
The diff-catching plan is manifest. 🐰📋

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and accurately summarizes the main change: it introduces a design plan for the codemap audit --base <ref> feature with a worktree and reindex strategy as a v1.x backlog item.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch docs/plan-audit-base

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 0/1 reviews remaining, refill in 36 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

@SutuSebastian
Copy link
Copy Markdown
Contributor Author

@coderabbitai full review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 3, 2026

✅ Actions performed

Full review triggered.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
docs/plans/audit-base.md (2)

55-78: ⚡ Quick win

Clarify backwards-compat/type changes needed for AuditBase (source: "ref" + new base.ref).

The plan correctly states base.source becomes an enum extension with "ref" (Lines 77-78) and adds base.ref (Lines 63-66). However, to prevent downstream breakage, it would help to explicitly call out that the TypeScript AuditBase/AuditEnvelope types must be updated so source accepts "baseline" | "ref" and base.ref exists for "ref" audits (or is optional).

Without that, consumers that currently assume only source: "baseline" may fail compilation or mishandle runtime shapes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/plans/audit-base.md` around lines 55 - 78, Update the TypeScript types
for AuditBase/AuditEnvelope so the per-delta discriminator base.source accepts
the union "baseline" | "ref" and ensure base.ref is present (or at least
optional) on the shape used when base.source === "ref"; convert AuditBase to a
discriminated union (or widen the interface) that includes a variant with
base.source: "ref" and base.ref: string (and keep the existing baseline variant)
so consumers that narrow on base.source handle the new shape without type
errors.

109-110: ⚡ Quick win

Document CODEMAP_AUDIT_CACHE_SIZE semantics (or remove the mention).

You mention CODEMAP_AUDIT_CACHE_SIZE (Line 109-110) as “defer to env var + sane defaults” but don’t specify:

  • whether it’s a hard cap or soft limit,
  • format (MiB? bytes?),
  • default value,
  • units, and how it interacts with the “5 entries OR 500 MiB” eviction (Line 84, 109-110).

Given the repo’s docs guidelines for config options in *.md, please either fully document it (with an example) or drop the env-var detail until v1.x+ when it’s actually implemented.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/plans/audit-base.md` around lines 109 - 110, Update the docs to either
fully specify CODEMAP_AUDIT_CACHE_SIZE or remove its mention: if keeping it, add
semantics for CODEMAP_AUDIT_CACHE_SIZE (environment variable), stating whether
it is a hard cap or soft target, the accepted format/units (e.g., integer bytes
or suffixes like MiB/GiB), the default value, how it interacts with the existing
“5 entries OR 500 MiB” eviction policy (which wins when limits conflict), and a
small example export line; if you prefer to defer, remove the reference to
CODEMAP_AUDIT_CACHE_SIZE entirely so the doc doesn’t promise an unimplemented
config option.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/plans/audit-base.md`:
- Around line 83-91: Clarify and reconcile D1/D2/D8 by specifying that the
worktree created via `git worktree add` is ephemeral and removed immediately
after producing a cached index: on cache miss create worktree at
`<projectRoot>/.codemap/audit-cache/<sha>/worktree-<tmp>` run
`runCodemapIndex({mode: "full"})` to build a temp `.codemap.db` stored as
`<projectRoot>/.codemap/audit-cache/<sha>/.codemap.db`, then `git worktree
remove --force <path>` in a `finally` while leaving the cached `.codemap.db` and
metadata (LRU timestamp/size) intact for future cache hits; explain that cache
hits use the cached `.codemap.db` (no worktree present) to skip reindexing, and
that eviction (LRU after 5 entries or 500 MiB) removes both the cached DB file
and any stale worktree dirs if present.
- Line 84: The sha-keyed cache can race when two processes resolve the same sha
and both create/populate <projectRoot>/.codemap/audit-cache/<sha>/; implement an
atomic per-sha strategy: acquire a per-sha lockfile (e.g., .lock inside the
<sha> directory or a global lockdir with atomic mkdir) before any populate,
write the temp DB/artifacts to a temp path (e.g., <sha>.tmp.<pid>), fsync/close
them, then atomically rename into the final <sha> directory so readers never see
partial state; readers should treat a missing final dir as a cache miss and, if
they encounter a lockfile, either wait with timeout/retry or single-flight by
checking the lock and then re-checking the final dir after lock release; apply
this for the code that resolves the cache key (the --base → resolved sha logic)
and the cache eviction routine (LRU/size checks) to avoid races during eviction.

---

Nitpick comments:
In `@docs/plans/audit-base.md`:
- Around line 55-78: Update the TypeScript types for AuditBase/AuditEnvelope so
the per-delta discriminator base.source accepts the union "baseline" | "ref" and
ensure base.ref is present (or at least optional) on the shape used when
base.source === "ref"; convert AuditBase to a discriminated union (or widen the
interface) that includes a variant with base.source: "ref" and base.ref: string
(and keep the existing baseline variant) so consumers that narrow on base.source
handle the new shape without type errors.
- Around line 109-110: Update the docs to either fully specify
CODEMAP_AUDIT_CACHE_SIZE or remove its mention: if keeping it, add semantics for
CODEMAP_AUDIT_CACHE_SIZE (environment variable), stating whether it is a hard
cap or soft target, the accepted format/units (e.g., integer bytes or suffixes
like MiB/GiB), the default value, how it interacts with the existing “5 entries
OR 500 MiB” eviction policy (which wins when limits conflict), and a small
example export line; if you prefer to defer, remove the reference to
CODEMAP_AUDIT_CACHE_SIZE entirely so the doc doesn’t promise an unimplemented
config option.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 044362b6-cb36-4975-9538-304f36120ea0

📥 Commits

Reviewing files that changed from the base of the PR and between 90092ae and 61eec22.

📒 Files selected for processing (2)
  • docs/plans/audit-base.md
  • docs/roadmap.md

Comment thread docs/plans/audit-base.md Outdated
Comment thread docs/plans/audit-base.md Outdated
…, worktree-as-cache lifecycle clarity, TS type widening callout

- D1/D2/D8 rewritten: worktree IS the cache entry (kept until LRU evicts);
  cleanup runs only on reindex failure rollback OR LRU eviction. The
  earlier ambiguity (D2 said 'cache by sha' while D8 said 'remove in
  finally') is resolved.
- D11 added: atomic cache populate via per-pid temp dir + POSIX rename
  → free single-flight semantics; no lock files needed. Same pattern
  for eviction. Closes the race CodeRabbit flagged on concurrent
  CI matrix runs against the same sha.
- AuditBase TS type widening to discriminated union called out
  explicitly above the Decisions table (Tracer 1 ships it).
- CODEMAP_AUDIT_CACHE_SIZE env var mention dropped — was promising an
  unimplemented config knob; v1 hardcodes the limits, defer to v1.x+.
@SutuSebastian
Copy link
Copy Markdown
Contributor Author

Triaged the 4 findings (1 Major + 3 Quick wins) per `pr-comment-fact-check`. All 4 had merit — applied in 9c92b05.

Applied (4 of 4):

  • Major — race on sha-keyed cache populate. Added D11 ('Atomic cache populate'): per-pid temp dir + POSIX `rename` gives free single-flight semantics, no lock files needed. Same pattern for LRU eviction. Concurrent CI matrix runs against the same sha now safe by construction.
  • Major (lifecycle ambiguity). Rewrote D1/D2/D8 — the earlier text said both 'cache by sha' AND 'cleanup in finally' which was self-contradictory. Now: worktree IS the cache entry, kept until LRU evicts; cleanup runs only on reindex failure rollback OR LRU eviction.
  • Nitpick — `AuditBase` TS type widening. Added explicit callout above the Decisions table that `AuditBase` becomes a discriminated union (`{source: 'baseline', name, ...} | {source: 'ref', ref, ...}`) shipped in Tracer 1.
  • Nitpick — `CODEMAP_AUDIT_CACHE_SIZE` underspecified. Dropped the env var mention. v1 hardcodes the limits; defer config surface to v1.x+ if real consumers ask. Plan no longer promises an unimplemented knob.

Plan PR ready for re-review or merge.

@SutuSebastian SutuSebastian merged commit e107bea into main May 3, 2026
10 checks passed
@SutuSebastian SutuSebastian deleted the docs/plan-audit-base branch May 3, 2026 13:37
SutuSebastian added a commit that referenced this pull request May 3, 2026
…impl of PR #51 plan) (#52)

* feat(audit): runAuditFromRef + worktree cache (Tracer 1)

* feat(audit): cmd-audit.ts --base <ref> parser + runner wired (Tracer 2)

- parseAuditRest gains --base <ref> / --base=<ref>; rejects --base + --baseline as mutually exclusive (clean error message); per-delta --<delta>-baseline still composes orthogonally.
- runAuditCmd dispatches to runAuditFromRef when opts.base is set; otherwise unchanged baseline-prefix path.
- makeWorktreeReindex injects the runtime save/restore around the worktree-side reindex (initCodemap mutates global state — restore needed so live-DB ops after the reindex still target the original project).
- main.ts forwards parsed.base.
- Help text updated with --base examples.
- .gitignore: .codemap/audit-cache/ (don't track cached worktrees; .codemap/recipes/ stays tracked).
- Smoke: bun src/index.ts audit --base HEAD~1 --json --summary works on this repo (4 files added vs HEAD~1, 7 deps, 0 deprecated). Second run hits cache (~110ms).
- 6 new parser tests cover --base alone, --base=, mutual exclusivity, per-delta composition, missing value.

* feat(audit): MCP audit tool gains base?: z.string() (Tracer 3)

- audit-engine.ts: makeWorktreeReindex lifted from cmd-audit.ts (single source of truth — both transports inject the same factory).
- tool-handlers.ts: handleAudit gains base/baseline_prefix mutual-exclusion guard + dispatches to runAuditFromRef when args.base set.
- mcp-server.ts: audit tool description rewritten to document the 3 snapshot sources (base / baseline_prefix / baselines per-delta) + their composition rules.
- 2 new MCP tests: rejects base + baseline_prefix combo, returns clean error in non-git project.
- HTTP POST /tool/audit lights up automatically via the existing dispatcher (Tracer 4 next).

* feat(audit): HTTP POST /tool/audit auto-wired (Tracer 4)

The dispatcher already validates with auditArgsSchema (which gained the
`base?: string` field in Tracer 3) — this commit just adds tests:
- audit + base + baseline_prefix combo → 400 mutually exclusive
- audit + base in non-git project → 400 'requires a git repository'

Both transports (MCP + HTTP) now expose the new --base path through the
same handleAudit pure function.

* docs(audit-base): sync README + architecture + glossary + agent rule/skill + changeset (Tracer 5)

- README.md: 'audit --base origin/main' code block + cache caveat
- docs/architecture.md: Audit wiring extended with --base details (worktree
  + atomic populate + cache + LRU + non-git error + GIT_* env hygiene);
  audit-engine.ts inventory mentions runAuditFromRef + makeWorktreeReindex;
  audit-worktree.ts added to application/ inventory
- docs/glossary.md: 'audit --base / git-ref baseline' entry; existing 'audit'
  entry now mentions 3 mutually-exclusive sources
- docs/roadmap.md: drop the v1.x backlog item (it shipped)
- docs/research/fallow.md: B.5 row updated to 'v1 + v1.x partial shipped';
  Adjacent-shipped bullet referencing PR #51 (plan) + #52 (impl)
- .agents/rules/codemap.md + templates/agents/rules/codemap.md: new
  'Audit vs git ref' table row + paragraph rewrite (Rule 10 lockstep)
- .agents/skills/codemap/SKILL.md + templates/agents/skills/codemap/SKILL.md:
  audit tool args + behavior reflect new 'base' field (Rule 10 lockstep)
- src/agents-init.ts: ensureGitignoreCodemapPattern now adds
  '.codemap/audit-cache/' alongside '.codemap.*' (the audit cache lives
  inside .codemap/ but .codemap/recipes/ stays git-tracked); test updated.
- .changeset/codemap-audit-base.md: minor — full design rationale
- docs/plans/audit-base.md: deleted per docs-governance Rule 3

* fix(audit-base): address all 6 CodeRabbit review comments + 2 nitpicks

Threads (all 6 actionable, no hallucinations):

1. .agents/rules + templates: stale 'B.6' spec leak → 'query_baselines entries' (consistent terminology)
2. audit-engine.ts: process-level mutex (_reindexChain promise queue) around makeWorktreeReindex's runtime-singleton swap. Concurrent HTTP/MCP audits no longer interleave initCodemap/configureResolver state. Long-term cleanup (runCodemapIndex takes explicit context) tracked separately.
3. audit-worktree.test.ts: commitFiles fixture now checks git commit exit code; throws with stderr on failure (was silently passing against previous commit)
4. audit-worktree.ts: tmpName gains randomUUID() suffix (defensive against cross-process races; in-process serialised by mutex from #2)
5. audit-worktree.ts: evictIfOverLimits accepts protectPath; populateWorktree passes finalPath so the freshly-populated entry can't evict itself when it alone exceeds MAX_CACHE_BYTES. New test covers the protect path.
6. mcp-server.ts + .agents/skills + templates: rephrased 'three mutually exclusive sources' to 'two mutually exclusive PRIMARY sources + optional per-delta overrides' (matches handleAudit's actual exclusivity rule)

Nitpicks:
- fallow.md status: 'v1 + v1.x partial shipped' → 'v1 + v1.x shipped; verdict deferred' (clarity)
- cmd-audit.test.ts: added --base= empty-value + --base "" two-token-empty-string parser tests for symmetry with --baseline coverage
SutuSebastian added a commit that referenced this pull request May 4, 2026
…age` table) (#56)

* docs(plan): static coverage ingestion (Istanbul JSON → `coverage` table)

Plans the C.11 candidate from `research/fallow.md` — `codemap ingest-coverage <path>`
reads Istanbul `coverage-final.json` into two new tables (`coverage` symbol-level +
`file_coverage` rollup), joinable to `symbols` for the killer "what's structurally
dead AND untested?" recipe in one query.

Resolves the open question from `fallow.md § 6` ("symbols column vs separate table?")
in favour of a separate table with `ON DELETE CASCADE` (D1) — coverage shape evolves
independently of structural columns; LEFT JOIN keeps NULL semantics explicit; rows
survive `--full` reindex via the `query_baselines` precedent (D6).

Key decisions:
- Istanbul JSON in v1; LCOV in v1.x; raw V8 traces never (D3, fallow's paid moat).
- One-shot `ingest-coverage` verb decoupled from `codemap` index runs (D4) — coverage
  cadence (per `bun test --coverage`) ≠ index cadence (per file edit).
- Statement coverage only in v1 (D5); branch/function deferred until a consumer asks.
- MCP/HTTP exposure as a query column, not a separate `coverage` tool (D9) — composes
  with every existing recipe + ad-hoc SQL.
- `codemap audit --delta coverage` deferred to v1.x (D10) — raw schema first.

Five-tracer plan: schema bump → engine → CLI verb → fixture + golden recipe → docs.
Plan only — implementation follows after CodeRabbit review per the established
workflow (PRs #46/47, #49/50, #51/52, #53/54).

* docs(plan): fact-check fixes — drop hallucinated SQL/projection/runner claims

Self-audit against the actual codebase surfaced four claims that didn't hold:

1. Killer recipe SQL referenced `callee_id` — `calls` is name-keyed
   (`callee_name TEXT`, no symbol-id FK; see `db.ts` `CallRow`). Rewrote
   the "no callers" predicate as `NOT EXISTS (… WHERE callee_name = s.name)`.
2. D7 claimed line-range projection is "the same `markers` already uses" —
   `markers` is line-pinned (`line_number INTEGER`), no projection.
   Reworded as "novel for this plan" with the actual mechanic spelled out.
3. D3 listed `bun test --coverage` as an Istanbul JSON emitter — `bun test
   --help` shows only `text` / `lcov` reporters today. Removed bun from the
   Istanbul-emitters list; left vitest/jest/c8/nyc with the explicit reporter
   flags they need.
4. D12 contradicted D6 ("rows absent until re-ingest" vs "rows survive
   `--full`"). Reconciled: empty is the correct initial state on first bump;
   subsequent bumps preserve via the `dropAll()` exclusion. Quoted the
   `lessons.md` policy verbatim instead of paraphrasing.

* docs(plan): v2 — fix CASCADE hazard + innermost-wins projection + nits

Self-grilling found two real schema design holes that would block execution:

1. **D6 CASCADE hazard.** Original draft keyed `coverage` on
   `symbol_id REFERENCES symbols(id) ON DELETE CASCADE`. Every `--full`
   reindex calls `dropAll()` → drops `symbols` → CASCADE wipes coverage,
   regardless of whether `coverage` itself was excluded from `dropAll()`.
   Recreated `symbols` get fresh auto-increment IDs anyway → coverage
   permanently lost without re-ingest. Fix: natural-key PK
   `(file_path, name, line_start)` — no FK to `symbols.id`. Survives the
   `symbols` drop-recreate cycle. Trade-off: orphan rows when files are
   deleted; cleaned by one explicit `DELETE FROM coverage WHERE file_path
   NOT IN (SELECT path FROM files)` after every ingest.

2. **D7 overlapping symbols.** Original draft: `line_start ≤ stmt_line ≤
   line_end` matches every enclosing scope. With nested symbols (class
   methods inside classes, closures inside functions), one Istanbul
   statement projects onto 3+ symbols, inflating `total_statements` 2-3×.
   Fix: innermost-wins via `(line_end - line_start) ASC LIMIT 1`. New
   `skipped.statements_no_symbol` counter for statements that fall outside
   every symbol range (top-level expressions, side-effect imports).

Nits cleared in the same pass:

- D2: drop `file_coverage` rollup table from v1 (aggregateable via
  GROUP BY on the symbol-level table; doubling sources of truth without
  a benchmark is premature). Promote to v1.x with a real query.
- D11: spec the `total_statements = 0 → coverage_pct IS NULL` edge case
  + document the cross-file name-collision lossiness in the killer recipe.
- Drop `--prune` flag (orphan cleanup is unconditional, no flag needed).
- Drop per-row `source` column (single meta key sufficient; one ingest
  at a time).
- Update killer recipe SQL to use the natural-key 3-column join.
- Drop made-up "~50 LoC LCOV ingester" estimate and "<50 ms / <1 ms /
  ~500 KB" performance numbers (no benchmark backed them).
- Tracer 1 / 2 / 3 acceptance criteria updated to match the new schema.

Plan is now ready for tracer-1 implementation. CodeRabbit pass deferred
(rate-limited 57m).

* docs(plan): tighten Bun-native API references (file read + perf note)

Plan correctly inherits the established Node vs Bun runtime split, but the
single tracer-3 reference understated it. Now:

- Tracer 3 cites `packaging.md § Node vs Bun` as the canonical pattern
  source instead of pointing at config.ts in passing.
- Performance section calls out the actual lever — `Bun.file(path).json()`
  uses Bun's native JSON parser, materially faster than V8 `JSON.parse`
  on multi-MB Istanbul payloads (real coverage files for medium codebases
  routinely hit several MB).

No new Bun-native API surfaces are added — the feature doesn't need
globbing, file writes, spawn, or hashing beyond what the existing engines
already use through their abstractions.

* docs(plan): v3 — ship LCOV in v1 + drop --source flag + bundle killer recipe

The "fully capable, no half-way APIs" principle reshapes three things:

1. **LCOV ingester ships in v1** alongside Istanbul. Original draft deferred
   LCOV to v1.x, which would exclude `bun test --coverage` users — i.e.
   codemap's own primary runtime. That's the textbook half-baked surface
   the principle bans. Two parser front-ends share one `upsertCoverageRows`
   core; LCOV is regex tokenizing over `SF:` / `DA:` / `end_of_record`.
   Tracer 2 splits into 2a (shared core + Istanbul parser) and 2b (LCOV
   parser), both writing identical normalised CoverageRow[] into the same
   upsert path.

2. **`--source istanbul|lcov` flag dropped.** Auto-detection from extension
   (`.json` → istanbul, `.info` → lcov, directory → probe both, error on
   ambiguous) is unambiguous; a flag for "tell codemap what it can already
   see" is API noise. Misnamed files can be renamed (one-liner) cheaper
   than codemap can grow a flag.

3. **Killer recipe ships as bundled `untested-and-dead.{sql,md}`** in
   `templates/recipes/`. Per the recipes-as-content registry (PR #37), the
   high-value queries become first-class agent surface. A buried doc
   snippet would be invisible to agents at session start; the bundled
   recipe shows up in `--recipes-json` and gets a `codemap query --recipe
   untested-and-dead` direct invocation.

Tracer 4 also fans out: Istanbul + LCOV fixtures cover the same partial
coverage shape; three golden recipes (`coverage-istanbul.json`,
`coverage-lcov.json`, `untested-and-dead.json`) prove format equivalence.
Out-of-scope, alternatives, performance section, title, and goal
statement all updated to match.

* docs(plan): v4 — agent-journey audit + bundled recipe shelf (D13)

Walked every D / OOS / tracer item against "fully capable + agent
first-class + no half-baked APIs". Found three half-baked surfaces:

1. **D2 deferral leaks "compose GROUP BY yourself" onto the agent.**
   Deferring the `file_coverage` table is correct (no benchmark proves
   it's needed) — but the agent-facing answer for "rank files by
   coverage" was missing. Fix: keep table deferral, ship a bundled
   `files-by-coverage.{sql,md}` recipe so the GROUP BY view IS
   first-class.

2. **D11 name-collision lossiness was acknowledged but unmitigated.**
   The killer recipe's `callee_name = s.name` cross-file lossiness
   was documented in the recipe SQL comment, but the recipe `.md`
   didn't give the agent any narrowing pattern. Now D11 ships three
   concrete narrowing patterns in the `.md` (file_path scope, default-
   export filter, exported-only restriction) so the agent has
   workable mitigations on day one.

3. **Missing recipe shelf for common agent questions.** Walking the
   journey: only "What's structurally dead AND untested?" had a recipe;
   "Rank files by coverage" and "Worst-covered exported symbols" forced
   ad-hoc SQL. Three recipes fully cover the agent journey end-to-end.

New D13 codifies the bundled-recipe principle: every common agent
question gets a `--recipe` verb. Three v1 recipes:
- `untested-and-dead.{sql,md}` (killer, with name-collision mitigations)
- `files-by-coverage.{sql,md}` (replaces D2's table deferral)
- `worst-covered-exports.{sql,md}` (top-N agent ask)

Each `.md` carries a frontmatter `actions` block (per PR #26) so agents
get per-row follow-up hints. All three appear in `--recipes-json`
automatically — agents discover them at session start.

New "Agent journey" section makes the principle visible: a table mapping
every common agent question to the v1 verb that answers it. If a row
ever shows "compose SQL yourself" without a recipe, the surface is
half-baked and needs a recipe before tracer 1 ships.

Tracer 4 expanded: ships all three recipes + five golden snapshots
(adds files-by-coverage.json + worst-covered-exports.json on top of the
three existing). Tracer 5 expanded: glossary + agent rule trigger
table gain three new rows.

Plan now passes the principle audit end-to-end.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant