
docs(plan): .codemap/ directory consolidation — single root + self-managed .gitignore#53

Merged
SutuSebastian merged 5 commits into main from docs/plan-codemap-dir
May 3, 2026

Conversation


@SutuSebastian SutuSebastian commented May 3, 2026

Summary

Plan for consolidating every codemap-managed path under a single .codemap/ directory with a tracked, self-managed .codemap/.gitignore that blacklists generated artifacts. Closes the per-feature agents-init.ts .gitignore patching surface (every new cache today — audit-cache in #52 — requires a user-facing .gitignore line; after this lands, new caches just bump the blacklist on next codemap boot).

Mirrors the flowbite-react pattern (.flowbite-react/.gitignore lists generated artifacts explicitly; everything else tracked by default).

Why now

Sketched layout

<projectRoot>/
├── .codemap/
│   ├── .gitignore              ← tracked, codemap-managed (idempotent rewrite on every boot)
│   ├── recipes/                ← tracked (user-authored SQL; existing)
│   ├── index.db                ← was .codemap.db; ignored
│   ├── index.db-wal            ← ignored
│   ├── index.db-shm            ← ignored
│   └── audit-cache/            ← ignored (per #52)
└── (root .gitignore — codemap entries no longer needed)

.codemap/.gitignore:

index.db
index.db-wal
index.db-shm
audit-cache/

See docs/plans/codemap-dir-consolidation.md for the full design — 10 decisions (D1-D10), 5 tracers, perf considerations, alternatives rejected (whitelist, root-level marker block, no migration), and out-of-scope list.

Test plan

  • CodeRabbit reviews D1 (blacklist vs whitelist), D2 (atomic rename migration), D3 (one-cycle deprecation), D4 (agents init cleanup behavior — leave existing root entries alone), D5 (regenerate .codemap/.gitignore on every boot, not just agents init)
  • Once approved + merged, follow-up PR implements Tracers 1-5

Summary by CodeRabbit

  • Documentation
    • Added design plan for consolidating codemap-managed assets into a single state directory, including configuration relocation and self-managed .gitignore handling.
    • Updated project roadmap with consolidation initiative reference.

…naged .codemap/.gitignore

Collapse the dual-pattern surface (.codemap.db at root + .codemap/<thing>/ for caches) to a single .codemap/ dir with a tracked .codemap/.gitignore that blacklists generated artifacts. Mirrors flowbite-react's pattern (.flowbite-react/.gitignore lists class-list.json + pid; everything else tracked by default).

Closes the per-feature agents-init.ts .gitignore patching surface — every new cache today (audit-cache shipped in #52) requires a user-facing .gitignore line; after this lands, new caches just bump the blacklist in .codemap/.gitignore on next codemap boot.

DB path migration: .codemap.db → .codemap/index.db via atomic rename on first run; shim ships in v1.x, removed in v2. Recipes stay at .codemap/recipes/ (user-authored source, not generated).
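As a rough illustration of the first-run rename this commit describes (a later commit in this thread drops the shim entirely), the move could be sketched as below. `migrateLegacyDb` is a hypothetical name, not codemap's code; the sketch assumes the old and new paths sit on the same filesystem (which is what makes `renameSync` atomic) and that no process has the database open while the sidecar files move.

```typescript
import { existsSync, mkdirSync, mkdtempSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: moves <root>/.codemap.db to <root>/.codemap/index.db once.
// Returns true only when a move actually happened.
function migrateLegacyDb(root: string): boolean {
  const legacy = join(root, ".codemap.db");
  const target = join(root, ".codemap", "index.db");
  // Nothing to migrate, or the new location is already populated: do nothing.
  if (!existsSync(legacy) || existsSync(target)) return false;
  mkdirSync(join(root, ".codemap"), { recursive: true });
  renameSync(legacy, target); // atomic when both paths are on one filesystem
  // SQLite sidecar files follow the main file if they exist.
  for (const suffix of ["-wal", "-shm"]) {
    if (existsSync(legacy + suffix)) renameSync(legacy + suffix, target + suffix);
  }
  return true;
}

// Demo against a throwaway directory.
const demoRoot = mkdtempSync(join(tmpdir(), "codemap-migrate-"));
writeFileSync(join(demoRoot, ".codemap.db"), "");
const moved = migrateLegacyDb(demoRoot);
const movedAgain = migrateLegacyDb(demoRoot);
console.log(moved, movedAgain); // true false
```

The second call is a no-op, which is the property that would let such a shim run unconditionally on every boot.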

Plan only — implementation follows after CodeRabbit review per the established workflow.

changeset-bot Bot commented May 3, 2026

⚠️ No Changeset found

Latest commit: b84a16d

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types



coderabbitai Bot commented May 3, 2026

Warning

Rate limit exceeded

@SutuSebastian has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 52 minutes and 26 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cb6d471d-1683-4436-b9c9-755145861c41

📥 Commits

Reviewing files that changed from the base of the PR and between 01d4da3 and b84a16d.

📒 Files selected for processing (1)
  • docs/plans/codemap-dir-consolidation.md
📝 Walkthrough


Adds a design plan document specifying consolidation of codemap-managed persisted assets under a single .codemap/ directory (default, configurable via --state-dir), with self-managed .gitignore and relocated config files. Updates the roadmap to reference this plan.

Changes

Documentation & Planning

| Layer / File(s) | Summary |
| --- | --- |
| Design Plan: `docs/plans/codemap-dir-consolidation.md` | New plan document outlining single state directory consolidation, self-managed gitignore strategy, config relocation to `<state-dir>/config.{ts,js,json}`, idempotent reconcilers, and no migration shims. |
| Roadmap Update: `docs/roadmap.md` | Backlog item added linking to the consolidation plan with summary of target directory layout and migration approach. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Suggested labels

documentation, enhancement

Poem

🐰 A burrow of files, once scattered and wide,
Now nestles together in one cozy hide,
The .codemap home grows, so tidy, so neat,
With gitignore's care—a self-managed treat!
Where config and database dance, side by side.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and accurately summarizes the primary change: a design plan for consolidating the .codemap/ directory into a single root with a self-managed .gitignore, which aligns perfectly with the documentation changes (adding codemap-dir-consolidation.md plan and roadmap entry). |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Review rate limit: 0/1 reviews remaining, refill in 52 minutes and 26 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

…figurable, move config into <state-dir>

Per user feedback:
1. Pre-v1 → no migration shim needed (D2 simplified to 'rm .codemap.db once'); existing two dev clones each pay one cleanup.
2. State directory is configurable: --state-dir CLI / CODEMAP_STATE_DIR env, default .codemap/ (D7).
3. Config file moves from <root>/codemap.config.{ts,json} → <state-dir>/config.{ts,js,json} (D8) — chicken-and-egg avoided by resolving state-dir from CLI/env only, not from config content.
4. flowbite-react pattern endorsed (D1 unchanged) — blacklist generated artifacts; tracked sources default visible.
5. Self-managed .gitignore caveat (less discoverable than root .gitignore) re-evaluated as a non-problem — git check-ignore + universal nested-gitignore convention cover it.

Tracer count stays at 5, but the set is re-shaped: state-dir resolver → config loader move → state-gitignore writer → agents-init updates → docs+changeset+plan deletion. No migration tracer.
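
To make points 2 and 3 above concrete, here is a minimal sketch of resolving the state directory from CLI/env only and then deriving the config candidates from it. The function names and signatures are illustrative rather than codemap's actual code, and the CLI-over-env ordering is an assumption; the thread only says precedence follows D7.

```typescript
import { isAbsolute, resolve } from "node:path";

// Default directory name as stated in the thread.
const STATE_DIR_DEFAULT = ".codemap";

interface ResolveStateDirInput {
  root: string; // absolute project root
  cliFlag?: string; // value passed via --state-dir, if any
  env?: Record<string, string | undefined>; // e.g. process.env
}

// Assumed precedence: --state-dir, then CODEMAP_STATE_DIR, then the default.
function resolveStateDir({ root, cliFlag, env = {} }: ResolveStateDirInput): string {
  const chosen = cliFlag || env["CODEMAP_STATE_DIR"] || STATE_DIR_DEFAULT;
  // Relative values are anchored at the project root; absolute ones pass through.
  return isAbsolute(chosen) ? chosen : resolve(root, chosen);
}

// The config path is derived from the already-resolved state dir, so config
// content never needs to be read to find the state dir (no chicken-and-egg).
function configCandidates(stateDir: string): string[] {
  return ["config.ts", "config.js", "config.json"].map((name) => resolve(stateDir, name));
}

const stateDir = resolveStateDir({ root: "/repo", env: { CODEMAP_STATE_DIR: "state" } });
console.log(configCandidates(stateDir));
```

Because the resolver never touches config content, a config file cannot relocate the directory it lives in, which is the cycle D8 is avoiding.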
…ot flowbite-react clone)

Adopts the IDEA from flowbite-react (every config file owned by an idempotent setup function; setup logic IS the migration) but expressed in codemap's own architectural vocabulary:

- ensure* naming (matches existing ensureGitignoreCodemapPattern)
- Engine layer in src/application/state-dir.ts (not flowbite's commands/setup-*.ts)
- Single Zod schema as both runtime validator and TS source via z.infer (codemap already uses Zod in tool-handlers.ts)
- .gitignore reconciler enforces canonical verbatim with a 'codemap-managed — edits will be overwritten' header (no marker block, no merge logic)
- TS configs validated-only (don't rewrite user code); JSON configs self-heal across versions

Tracer 2 reshaped to expose pure { before, after, written } return values for testability — side effects at the edge per codemap's existing layering. Bootstrap orchestrator fans out to ensureStateGitignore + ensureStateConfig (and any future ensure*) so adding a new self-healing file is one-line registration.
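
The fan-out described above might look roughly like the following. `ensureStateDir` is named later in this thread, but its signature here, the `NamedReconciler` type, and the stub reconcilers are assumptions made purely for illustration.

```typescript
// Result shape from the commit message: what was on disk, what should be
// there, and whether a write was needed.
interface ReconcileResult {
  before: string | null;
  after: string;
  written: boolean;
}

interface NamedReconciler {
  name: string;
  run: (stateDir: string) => ReconcileResult;
}

interface ReconcileReportEntry {
  name: string;
  result: ReconcileResult;
}

// Runs every registered reconciler in order and collects the results, so the
// caller decides what to log. Adding a self-healing file is one more entry.
function ensureStateDir(stateDir: string, reconcilers: NamedReconciler[]): ReconcileReportEntry[] {
  return reconcilers.map(({ name, run }) => ({ name, result: run(stateDir) }));
}

// Stand-in reconcilers; real ones would read and write files under stateDir.
const registry: NamedReconciler[] = [
  { name: "gitignore", run: () => ({ before: null, after: "index.db\n", written: true }) },
  { name: "config", run: () => ({ before: "{}", after: "{}", written: false }) },
];

const report = ensureStateDir("/repo/.codemap", registry);
const changed = report.filter((entry) => entry.result.written).map((entry) => entry.name);
console.log(changed); // [ 'gitignore' ]
```

Returning plain results rather than logging inside each reconciler keeps the side effects at the edge, in line with the testability goal stated above.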
@SutuSebastian
Contributor Author

@coderabbitai full review


coderabbitai Bot commented May 3, 2026

✅ Actions performed

Full review triggered.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
docs/plans/codemap-dir-consolidation.md (2)

68-68: ⚖️ Poor tradeoff

Consider splitting D11 for readability.

D11 packs multiple concepts (self-healing principle, Zod validation, .gitignore enforcement, upgrade UX) into a single table cell. Readers may miss key details in the dense prose.

Options:

  • Break D11 into D11a (self-healing principle), D11b (Zod schema), D11c (.gitignore ownership)
  • Move implementation details (Zod, ensure* naming) to the Tracers section and keep D11 focused on the high-level principle
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/plans/codemap-dir-consolidation.md` at line 68, Split the dense D11 cell
into clearer parts: create D11a describing the self-healing principle and
runtime flow (the ensure<Thing> engines run on boot: read → validate → reconcile
→ write-on-drift → log), create D11b to call out the Zod-based schema/validation
(reference Zod and tool-handlers.ts and z.infer for runtime/TS reuse), and
create D11c to state `.gitignore` ownership/behaviour (reference
ensureGitignoreCodemapPattern and the header line `# codemap-managed — edits
will be overwritten`); alternatively move the implementation specifics (ensure*
naming, Zod, tool-handlers.ts) into the Tracers/implementation section and keep
D11 as a concise high-level principle.

60-60: ⚡ Quick win

Adjust tense to match design-phase status.

D2 states "this PR moves" but line 3 notes this document is "in design (no code)". Use future tense for consistency.

📝 Suggested revision
-| D2  | **No migration shim.** Codemap is pre-v1; this PR moves `.codemap.db` → `<state-dir>/index.db` cleanly with no compat code. Existing dev clones run `rm .codemap.db` once and re-index. Same for `<root>/codemap.config.{ts,json}` → `<state-dir>/config.{ts,json}`. Changelog notes the one-line cleanup.
+| D2  | **No migration shim.** Codemap is pre-v1; implementation will move `.codemap.db` → `<state-dir>/index.db` cleanly with no compat code. Existing dev clones run `rm .codemap.db` once and re-index. Same for `<root>/codemap.config.{ts,json}` → `<state-dir>/config.{ts,json}`. Changelog notes the one-line cleanup.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/plans/codemap-dir-consolidation.md` at line 60, Change the phrasing in
the design doc to future tense to reflect "design (no code)" status: update
entries like "agents init writes both the nested `.gitignore` and a root entry"
to "agents init will write..." or "codemap agents init will write/update
`<state-dir>/.gitignore` and will ensure root `.gitignore` contains
`<state-dir>/` once", and similarly change "this PR moves" in D2 to "this
proposal will move" (or equivalent future-tense wording) so all statements
(e.g., D2, D3, and any lines referencing `agents init`, `codemap agents init`,
`<state-dir>/.gitignore`, and root `.gitignore`) consistently use future tense.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/plans/codemap-dir-consolidation.md`:
- Line 91: Replace the incorrectly escaped glob in the table cell: find the
string "`<state-dir>/.gitignore` (`\* + !recipes/**`)" and remove the backslash
escaping so the pattern reads "`<state-dir>/.gitignore` (`* + !recipes/**`)"
(or ensure both the path and pattern are in inline code spans). This fixes the
markdown rendering of the asterisks without changing the wording.
- Line 31: The "(D12)" decision reference after "config.ts" is invalid because
the Decisions table only defines D1–D11; update the parenthetical on the
"config.ts" line to reference the correct decision ID (e.g., replace "(D12)"
with the actual decision ID from the Decisions table) or remove the
parenthetical entirely and, if appropriate, add a correct link/reference to the
matching decision entry so the config.ts entry points to an existing decision.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9357c927-9d91-462b-b9f5-44739f1ffe64

📥 Commits

Reviewing files that changed from the base of the PR and between fe5a355 and 01d4da3.

📒 Files selected for processing (2)
  • docs/plans/codemap-dir-consolidation.md
  • docs/roadmap.md

Comment thread docs/plans/codemap-dir-consolidation.md Outdated
Comment thread docs/plans/codemap-dir-consolidation.md Outdated
…tense

- Thread 1: layout sketch comment '(D12)' → '(D8)' (only D1-D11 exist)
- Thread 2: alternatives table inline-code escaping fixed (backslashes inside backticks render literally; not needed)
- Nitpick 2: D2 tense aligned with 'design — no code' status ('this PR moves' → 'the implementation will move')
- Nitpick 1 (split D11) declined: D11 is intentionally an integrated story for the self-healing principle; splitting fragments the narrative
@SutuSebastian
Contributor Author

Triaged the 2 inline threads + 2 nitpicks per `pr-comment-fact-check`. 3 of 4 actionable; 1 declined.

Applied (3):

  • ✅ Thread 1 — '(D12)' → '(D8)' in layout sketch (only D1-D11 exist).
  • ✅ Thread 2 — alternatives table escaping (asterisks inside backtick spans render literally; backslashes were redundant noise from a previous formatter pass).
  • ✅ Nitpick 2 — D2 tense: 'this PR moves' → 'the implementation will move' (matches 'design — no code' status).

Declined (1):

  • 💭 Nitpick 1 — split D11 into D11a/D11b/D11c. D11 is intentionally an integrated story for the self-healing principle (the connection between 'setup IS the migration', Zod-as-single-source, and .gitignore-overwrite-semantics is the whole point — splitting fragments the narrative). Decision-table cells are allowed to be dense when the ideas reinforce each other; readers needing the granular breakdown get it via Tracers 2 + 3 + the linked flowbite-react reference. Per `concise-comments`: "extract to a docs/ file and link to it" applies for >3-line prose blocks; D11 fits in one cell intentionally.

Plan ready for re-review or merge.

@SutuSebastian SutuSebastian merged commit b62f308 into main May 3, 2026
10 checks passed
@SutuSebastian SutuSebastian deleted the docs/plan-codemap-dir branch May 3, 2026 15:19
SutuSebastian added a commit that referenced this pull request May 3, 2026
…es (impl of PR #53 plan) (#54)

* feat(state-dir): resolveStateDir + DB at <state-dir>/index.db (Tracer 1)

- application/state-dir.ts: resolveStateDir({root, cliFlag, env}) per plan §D7. Constants STATE_DIR_DEFAULT='.codemap', STATE_DB_NAME='index.db'. 12 unit tests cover precedence + relative/absolute paths.
- config.ts: ResolvedCodemapConfig gains stateDir; resolveCodemapConfig 3rd arg opts.stateDir; databasePath defaults to <stateDir>/index.db. User-supplied databasePath wins (escape hatch).
- config.ts: loadUserConfig reads <state-dir>/config.{ts,js,json} (D8); legacy <root>/codemap.config.* dropped (pre-v1).
- runtime.ts: getStateDir() getter.
- bootstrap.ts: --state-dir + CODEMAP_STATE_DIR; precedence per D7.
- bootstrap-codemap.ts (new): single helper extracts the loadUserConfig+resolveCodemapConfig+initCodemap+configureResolver dance from 9 cmd-* files. Tracer 4's ensureStateDir attaches here.
- All 9 cmd-* files refactored; stateDir threaded through interfaces + main.ts dispatch + ServerOpts (mcp/serve).
- audit-worktree.ts: cached entries at <sha>/.codemap/index.db (recursive layout — each cached worktree is its own self-contained codemap project).
- audit-engine.ts: makeWorktreeReindex stops hard-coding db path; openCodemapDatabase() reads from initialised runtime.
- sqlite-db.ts: openCodemapDatabase mkdirs the parent (state-dir may not exist on fresh project).
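
The database-path rule in the `config.ts` bullet above (default under the state dir, user value wins) can be sketched as follows. `resolveDatabasePath` is a hypothetical helper, and anchoring a relative user path at the project root is an assumption the commit message does not spell out.

```typescript
import { isAbsolute, join, resolve } from "node:path";

// Constant name and value taken from the commit message.
const STATE_DB_NAME = "index.db";

// Hypothetical helper: a user-supplied databasePath is the escape hatch and
// wins; otherwise the database lives at <stateDir>/index.db.
function resolveDatabasePath(root: string, stateDir: string, userDatabasePath?: string): string {
  if (userDatabasePath) {
    // Assumption: relative user paths are resolved against the project root.
    return isAbsolute(userDatabasePath) ? userDatabasePath : resolve(root, userDatabasePath);
  }
  return join(stateDir, STATE_DB_NAME);
}

const defaultDb = resolveDatabasePath("/repo", join("/repo", ".codemap"));
const customDb = resolveDatabasePath("/repo", join("/repo", ".codemap"), "data/custom.db");
console.log(defaultDb, customDb);
```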

Dogfood:
- .codemap/.gitignore (self-managed, blacklist) — codemap repo + fixtures/minimal/
- root .gitignore: dropped .codemap.* and .codemap/audit-cache/ (nested .gitignore handles them)
- fixtures/minimal/.codemap.db* removed (stale legacy)

703 tests pass.

* feat(state-dir): ensureStateGitignore reconciler — self-healing (Tracer 2)

- STATE_GITIGNORE_BODY constant — single source of truth for the canonical blacklist.
- ensureStateGitignore(stateDir) — pure-shape return ({before, after, written}); idempotent (no write on steady state); auto-mkdir; user-edits rewritten back per D11 (file is codemap-managed; header line declares it).
- 5 tests cover: fresh write, idempotent, user-modified, older-version self-heal, returned shape matches disk.

Bumping STATE_GITIGNORE_BODY in a future PR is the entire migration — every consumer's project repairs itself on next codemap run.
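
A minimal sketch of such a reconciler, assuming plain `node:fs` calls. The ignore entries come from the plan's layout; the header wording is illustrative, and the real implementation may differ in any detail.

```typescript
import { existsSync, mkdirSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Canonical body. The header line wording is an illustrative stand-in.
const STATE_GITIGNORE_BODY = [
  "# codemap-managed; edits will be overwritten",
  "index.db",
  "index.db-wal",
  "index.db-shm",
  "audit-cache/",
  "",
].join("\n");

interface ReconcileResult {
  before: string | null; // file content found on disk, or null if absent
  after: string; // canonical content
  written: boolean; // true only when the file had to be (re)written
}

function ensureStateGitignore(stateDir: string): ReconcileResult {
  const target = join(stateDir, ".gitignore");
  const before = existsSync(target) ? readFileSync(target, "utf8") : null;
  const written = before !== STATE_GITIGNORE_BODY;
  if (written) {
    mkdirSync(stateDir, { recursive: true }); // state dir may not exist yet
    writeFileSync(target, STATE_GITIGNORE_BODY, "utf8");
  }
  return { before, after: STATE_GITIGNORE_BODY, written };
}

// Demo in a throwaway directory: fresh write, then steady state.
const demoStateDir = join(mkdtempSync(join(tmpdir(), "codemap-demo-")), ".codemap");
const firstRun = ensureStateGitignore(demoStateDir);
const secondRun = ensureStateGitignore(demoStateDir);
console.log(firstRun.written, secondRun.written); // true false
```

Any drift, whether a user edit or an older canonical body, takes the same rewrite path, which is why changing the constant is enough to update existing projects.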

* feat(state-dir): ensureStateConfig reconciler (Tracer 3)

* feat(state-dir): bootstrap orchestrator + agents-init delegation (Tracer 4)

* docs(state-dir): sync README + architecture + glossary + agent rule/skill + changeset (Tracer 5)

- README.md: --state-dir flag, config-file location, self-healing callout
- docs/architecture.md: state-dir resolver in src/config.ts intro; bulk .codemap.db → .codemap/index.db; User config section rewritten with self-healing details (ensureStateGitignore + ensureStateConfig from src/application/, single attachment point in cli/bootstrap-codemap.ts)
- docs/glossary.md: new entries for '.codemap/' / <state-dir> / CODEMAP_STATE_DIR; '.codemap/index.db'; '.codemap/.gitignore' / self-healing files
- docs/roadmap.md: drop the consolidated-dir backlog item (it shipped)
- docs/research/fallow.md: Adjacent-shipped bullet referencing PR #53 (plan) + #54 (impl)
- .agents/rules/codemap.md + templates/agents/rules/codemap.md: bulk .codemap.db → .codemap/index.db; one-paragraph addition explaining state-dir + self-healing .gitignore (Rule 10 lockstep)
- .agents/skills/codemap/SKILL.md + templates/agents/skills/codemap/SKILL.md: bulk path update (Rule 10 lockstep)
- .changeset/codemap-dir-consolidation.md: minor — full design rationale + cleanup steps
- docs/plans/codemap-dir-consolidation.md: deleted per docs-governance Rule 3

* docs(state-dir): refresh stale path refs across docs/ + slim self-authored comments

Doc staleness sweep (after Tracer 5):
- docs/glossary.md, docs/agents.md, docs/benchmark.md, docs/why-codemap.md, docs/research/competitive-scan-2026-04.md, docs/research/fallow.md (B.6 row): bulk `.codemap.db` → `.codemap/index.db` everywhere except the intentional 'old → new' migration callouts.
- docs/architecture.md, docs/research/fallow.md, docs/packaging.md: `codemap.config.{ts,json}` → `<state-dir>/config.{ts,js,json}`.
- docs/agents.md § Git: rewritten to describe the self-managed <state-dir>/.gitignore reconciler instead of root-.gitignore patching.
- docs/benchmark.md: 'where the DB lives' updated; manual .gitignore note dropped (reconciler handles it).

Concise-comments sweep on this turn's authored comments:
- src/application/state-config.ts: 2 inline comments slimmed (TS/JS-validation-only and passthrough-rationale).

* fix(state-dir): address CodeRabbit findings (1 inline + 4 outside-diff + 2 nitpicks)

Real bug:
- main.ts: runListBaselinesCmd was being called without stateDir — `codemap query --baselines` would fall back to the default DB instead of the caller-selected one. Fixed.

Stale doc refs:
- audit-worktree.ts: 3 JSDoc strings still said `.codemap.db` after Tracer 1's CACHE_ENTRY_DB_REL move; bumped to `.codemap/index.db`.
- bootstrap.ts: printCliUsage() had two `.codemap.db` refs + missing --state-dir/CODEMAP_STATE_DIR docs in Environment+Options. Fixed.
- config.ts: Zod databasePath.describe() said default was `<root>/.codemap.db`; corrected to `<state-dir>/index.db`.
- .agents/skills + templates skills: 2 hard-coded `.codemap/` refs reworded to `<state-dir>/` with `(default .codemap/)` callout (state-dir is configurable).

Nitpicks applied:
- state-dir.test.ts: dropped redundant `require('node:fs')` for mkdirSync (already imported).
- bootstrap-codemap.ts: consolidated two single-import lines from state-dir into one statement.

Nitpicks declined:
- changeset code-fence missing 'text' lang — purely cosmetic.
- cmd-index.ts JSDoc on runIndexCmd — 'all public APIs need JSDoc' is a fabricated rule (sibling cmds inconsistent; same hallucination rejected on PR #50).
SutuSebastian added a commit that referenced this pull request May 4, 2026
…age` table) (#56)

* docs(plan): static coverage ingestion (Istanbul JSON → `coverage` table)

Plans the C.11 candidate from `research/fallow.md` — `codemap ingest-coverage <path>`
reads Istanbul `coverage-final.json` into two new tables (`coverage` symbol-level +
`file_coverage` rollup), joinable to `symbols` for the killer "what's structurally
dead AND untested?" recipe in one query.

Resolves the open question from `fallow.md § 6` ("symbols column vs separate table?")
in favour of a separate table with `ON DELETE CASCADE` (D1) — coverage shape evolves
independently of structural columns; LEFT JOIN keeps NULL semantics explicit; rows
survive `--full` reindex via the `query_baselines` precedent (D6).

Key decisions:
- Istanbul JSON in v1; LCOV in v1.x; raw V8 traces never (D3, fallow's paid moat).
- One-shot `ingest-coverage` verb decoupled from `codemap` index runs (D4) — coverage
  cadence (per `bun test --coverage`) ≠ index cadence (per file edit).
- Statement coverage only in v1 (D5); branch/function deferred until a consumer asks.
- MCP/HTTP exposure as a query column, not a separate `coverage` tool (D9) — composes
  with every existing recipe + ad-hoc SQL.
- `codemap audit --delta coverage` deferred to v1.x (D10) — raw schema first.

Five-tracer plan: schema bump → engine → CLI verb → fixture + golden recipe → docs.
Plan only — implementation follows after CodeRabbit review per the established
workflow (PRs #46/47, #49/50, #51/52, #53/54).

* docs(plan): fact-check fixes — drop hallucinated SQL/projection/runner claims

Self-audit against the actual codebase surfaced four claims that didn't hold:

1. Killer recipe SQL referenced `callee_id` — `calls` is name-keyed
   (`callee_name TEXT`, no symbol-id FK; see `db.ts` `CallRow`). Rewrote
   the "no callers" predicate as `NOT EXISTS (… WHERE callee_name = s.name)`.
2. D7 claimed line-range projection is "the same `markers` already uses" —
   `markers` is line-pinned (`line_number INTEGER`), no projection.
   Reworded as "novel for this plan" with the actual mechanic spelled out.
3. D3 listed `bun test --coverage` as an Istanbul JSON emitter — `bun test
   --help` shows only `text` / `lcov` reporters today. Removed bun from the
   Istanbul-emitters list; left vitest/jest/c8/nyc with the explicit reporter
   flags they need.
4. D12 contradicted D6 ("rows absent until re-ingest" vs "rows survive
   `--full`"). Reconciled: empty is the correct initial state on first bump;
   subsequent bumps preserve via the `dropAll()` exclusion. Quoted the
   `lessons.md` policy verbatim instead of paraphrasing.

* docs(plan): v2 — fix CASCADE hazard + innermost-wins projection + nits

Self-grilling found two real schema design holes that would block execution:

1. **D6 CASCADE hazard.** Original draft keyed `coverage` on
   `symbol_id REFERENCES symbols(id) ON DELETE CASCADE`. Every `--full`
   reindex calls `dropAll()` → drops `symbols` → CASCADE wipes coverage,
   regardless of whether `coverage` itself was excluded from `dropAll()`.
   Recreated `symbols` get fresh auto-increment IDs anyway → coverage
   permanently lost without re-ingest. Fix: natural-key PK
   `(file_path, name, line_start)` — no FK to `symbols.id`. Survives the
   `symbols` drop-recreate cycle. Trade-off: orphan rows when files are
   deleted; cleaned by one explicit `DELETE FROM coverage WHERE file_path
   NOT IN (SELECT path FROM files)` after every ingest.

2. **D7 overlapping symbols.** Original draft: `line_start ≤ stmt_line ≤
   line_end` matches every enclosing scope. With nested symbols (class
   methods inside classes, closures inside functions), one Istanbul
   statement projects onto 3+ symbols, inflating `total_statements` 2-3×.
   Fix: innermost-wins via `(line_end - line_start) ASC LIMIT 1`. New
   `skipped.statements_no_symbol` counter for statements that fall outside
   every symbol range (top-level expressions, side-effect imports).
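
The innermost-wins rule from item 2 can be illustrated in memory. This is a sketch rather than the plan's SQL: the type and function names are hypothetical, and ties between equally sized ranges go to the first match, an order the plan does not specify.

```typescript
interface SymbolRange {
  name: string;
  lineStart: number;
  lineEnd: number;
}

// Picks the narrowest symbol whose range encloses the statement line, the
// in-memory analogue of ordering by (line_end - line_start) ASC LIMIT 1.
function innermostSymbol(symbols: SymbolRange[], stmtLine: number): SymbolRange | undefined {
  let best: SymbolRange | undefined;
  for (const s of symbols) {
    if (s.lineStart <= stmtLine && stmtLine <= s.lineEnd) {
      if (!best || s.lineEnd - s.lineStart < best.lineEnd - best.lineStart) best = s;
    }
  }
  return best;
}

// Attributes each statement to exactly one symbol and counts the rest.
function projectStatements(symbols: SymbolRange[], stmtLines: number[]) {
  const perSymbol = new Map<string, number>();
  let statementsNoSymbol = 0;
  for (const line of stmtLines) {
    const owner = innermostSymbol(symbols, line);
    if (!owner) {
      statementsNoSymbol += 1; // e.g. a top-level expression
      continue;
    }
    const key = `${owner.name}:${owner.lineStart}`;
    perSymbol.set(key, (perSymbol.get(key) ?? 0) + 1);
  }
  return { perSymbol, statementsNoSymbol };
}

const nested: SymbolRange[] = [
  { name: "Widget", lineStart: 1, lineEnd: 20 }, // class
  { name: "render", lineStart: 5, lineEnd: 10 }, // method inside the class
  { name: "onClick", lineStart: 6, lineEnd: 8 }, // closure inside the method
];
const projection = projectStatements(nested, [7, 12, 30]);
console.log(projection.perSymbol, projection.statementsNoSymbol);
```

Line 7 sits inside all three ranges but is counted once, against `onClick`; line 12 falls to `Widget`; line 30 lands in the no-symbol counter.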

Nits cleared in the same pass:

- D2: drop `file_coverage` rollup table from v1 (aggregateable via
  GROUP BY on the symbol-level table; doubling sources of truth without
  a benchmark is premature). Promote to v1.x with a real query.
- D11: spec the `total_statements = 0 → coverage_pct IS NULL` edge case
  + document the cross-file name-collision lossiness in the killer recipe.
- Drop `--prune` flag (orphan cleanup is unconditional, no flag needed).
- Drop per-row `source` column (single meta key sufficient; one ingest
  at a time).
- Update killer recipe SQL to use the natural-key 3-column join.
- Drop made-up "~50 LoC LCOV ingester" estimate and "<50 ms / <1 ms /
  ~500 KB" performance numbers (no benchmark backed them).
- Tracer 1 / 2 / 3 acceptance criteria updated to match the new schema.

Plan is now ready for tracer-1 implementation. CodeRabbit pass deferred
(rate-limited 57m).

* docs(plan): tighten Bun-native API references (file read + perf note)

Plan correctly inherits the established Node vs Bun runtime split, but the
single tracer-3 reference understated it. Now:

- Tracer 3 cites `packaging.md § Node vs Bun` as the canonical pattern
  source instead of pointing at config.ts in passing.
- Performance section calls out the actual lever — `Bun.file(path).json()`
  uses Bun's native JSON parser, materially faster than V8 `JSON.parse`
  on multi-MB Istanbul payloads (real coverage files for medium codebases
  routinely hit several MB).

No new Bun-native API surfaces are added — the feature doesn't need
globbing, file writes, spawn, or hashing beyond what the existing engines
already use through their abstractions.

* docs(plan): v3 — ship LCOV in v1 + drop --source flag + bundle killer recipe

The "fully capable, no half-way APIs" principle reshapes three things:

1. **LCOV ingester ships in v1** alongside Istanbul. Original draft deferred
   LCOV to v1.x, which would exclude `bun test --coverage` users — i.e.
   codemap's own primary runtime. That's the textbook half-baked surface
   the principle bans. Two parser front-ends share one `upsertCoverageRows`
   core; LCOV is regex tokenizing over `SF:` / `DA:` / `end_of_record`.
   Tracer 2 splits into 2a (shared core + Istanbul parser) and 2b (LCOV
   parser), both writing identical normalised CoverageRow[] into the same
   upsert path.

2. **`--source istanbul|lcov` flag dropped.** Auto-detection from extension
   (`.json` → istanbul, `.info` → lcov, directory → probe both, error on
   ambiguous) is unambiguous; a flag for "tell codemap what it can already
   see" is API noise. Misnamed files can be renamed (one-liner) cheaper
   than codemap can grow a flag.

3. **Killer recipe ships as bundled `untested-and-dead.{sql,md}`** in
   `templates/recipes/`. Per the recipes-as-content registry (PR #37), the
   high-value queries become first-class agent surface. A buried doc
   snippet would be invisible to agents at session start; the bundled
   recipe shows up in `--recipes-json` and gets a `codemap query --recipe
   untested-and-dead` direct invocation.
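
Items 1 and 2 can be sketched together: extension-based format detection and a line-oriented LCOV reader over `SF:` / `DA:` / `end_of_record`. The names below are hypothetical, directory probing is left out, and the `LineHit` row shape is an assumption standing in for the plan's normalised `CoverageRow`.

```typescript
type CoverageFormat = "istanbul" | "lcov";

// Extension-based detection as described above; directory probing is omitted.
function detectFormat(path: string): CoverageFormat {
  if (path.endsWith(".json")) return "istanbul";
  if (path.endsWith(".info")) return "lcov";
  throw new Error(`cannot infer coverage format from "${path}"`);
}

interface LineHit {
  filePath: string;
  line: number;
  hits: number;
}

// Reads the three LCOV record types named in the plan and ignores the rest.
function parseLcov(text: string): LineHit[] {
  const rows: LineHit[] = [];
  let current: string | null = null;
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line.startsWith("SF:")) {
      current = line.slice(3); // start of a per-file record
    } else if (line === "end_of_record") {
      current = null;
    } else if (current !== null) {
      const m = /^DA:(\d+),(\d+)/.exec(line); // DA:<line>,<hit count>
      if (m) rows.push({ filePath: current, line: Number(m[1]), hits: Number(m[2]) });
    }
  }
  return rows;
}

const sample = ["SF:src/a.ts", "DA:1,3", "DA:2,0", "LF:2", "end_of_record", "SF:src/b.ts", "DA:10,1", "end_of_record"].join("\n");
const hits = parseLcov(sample);
console.log(detectFormat("coverage/lcov.info"), hits.length); // lcov 3
```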

Tracer 4 also fans out: Istanbul + LCOV fixtures cover the same partial
coverage shape; three golden recipes (`coverage-istanbul.json`,
`coverage-lcov.json`, `untested-and-dead.json`) prove format equivalence.
Out-of-scope, alternatives, performance section, title, and goal
statement all updated to match.

* docs(plan): v4 — agent-journey audit + bundled recipe shelf (D13)

Walked every D / OOS / tracer item against "fully capable + agent
first-class + no half-baked APIs". Found three half-baked surfaces:

1. **D2 deferral leaks "compose GROUP BY yourself" onto the agent.**
   Deferring the `file_coverage` table is correct (no benchmark proves
   it's needed) — but the agent-facing answer for "rank files by
   coverage" was missing. Fix: keep table deferral, ship a bundled
   `files-by-coverage.{sql,md}` recipe so the GROUP BY view IS
   first-class.

2. **D11 name-collision lossiness was acknowledged but unmitigated.**
   The killer recipe's `callee_name = s.name` cross-file lossiness
   was documented in the recipe SQL comment, but the recipe `.md`
   didn't give the agent any narrowing pattern. Now D11 ships three
   concrete narrowing patterns in the `.md` (file_path scope, default-
   export filter, exported-only restriction) so the agent has
   workable mitigations on day one.

3. **Missing recipe shelf for common agent questions.** Walking the
   journey: only "What's structurally dead AND untested?" had a recipe;
   "Rank files by coverage" and "Worst-covered exported symbols" forced
   ad-hoc SQL. Three recipes fully cover the agent journey end-to-end.

New D13 codifies the bundled-recipe principle: every common agent
question gets a `--recipe` verb. Three v1 recipes:
- `untested-and-dead.{sql,md}` (killer, with name-collision mitigations)
- `files-by-coverage.{sql,md}` (replaces D2's table deferral)
- `worst-covered-exports.{sql,md}` (top-N agent ask)
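
For intuition, the roll-up behind `files-by-coverage` can be written in memory. The bundled recipe itself is SQL; this sketch only mirrors the GROUP BY idea and the `total_statements = 0 → coverage_pct IS NULL` rule from D11, with hypothetical field names and an assumed sort order.

```typescript
interface SymbolCoverage {
  filePath: string;
  name: string;
  totalStatements: number;
  coveredStatements: number; // assumed column; the thread names only totals and pct
}

interface FileCoverage {
  filePath: string;
  totalStatements: number;
  coveredStatements: number;
  coveragePct: number | null; // null when there is nothing to cover
}

// Groups symbol-level rows by file, like a GROUP BY on file_path.
function filesByCoverage(rows: SymbolCoverage[]): FileCoverage[] {
  const byFile = new Map<string, { total: number; covered: number }>();
  for (const r of rows) {
    const acc = byFile.get(r.filePath) ?? { total: 0, covered: 0 };
    acc.total += r.totalStatements;
    acc.covered += r.coveredStatements;
    byFile.set(r.filePath, acc);
  }
  const files: FileCoverage[] = Array.from(byFile.entries()).map(([filePath, acc]) => ({
    filePath,
    totalStatements: acc.total,
    coveredStatements: acc.covered,
    coveragePct: acc.total === 0 ? null : (100 * acc.covered) / acc.total,
  }));
  // Worst-covered first; files with no statements (null pct) go last.
  return files.sort((a, b) => {
    if (a.coveragePct === null) return b.coveragePct === null ? 0 : 1;
    if (b.coveragePct === null) return -1;
    return a.coveragePct - b.coveragePct;
  });
}

const ranked = filesByCoverage([
  { filePath: "src/a.ts", name: "f", totalStatements: 4, coveredStatements: 4 },
  { filePath: "src/b.ts", name: "g", totalStatements: 3, coveredStatements: 0 },
  { filePath: "src/b.ts", name: "h", totalStatements: 1, coveredStatements: 1 },
  { filePath: "src/types.ts", name: "T", totalStatements: 0, coveredStatements: 0 },
]);
console.log(ranked.map((f) => `${f.filePath}=${f.coveragePct}`));
```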

Each `.md` carries a frontmatter `actions` block (per PR #26) so agents
get per-row follow-up hints. All three appear in `--recipes-json`
automatically — agents discover them at session start.

New "Agent journey" section makes the principle visible: a table mapping
every common agent question to the v1 verb that answers it. If a row
ever shows "compose SQL yourself" without a recipe, the surface is
half-baked and needs a recipe before tracer 1 ships.

Tracer 4 expanded: ships all three recipes + five golden snapshots
(adds files-by-coverage.json + worst-covered-exports.json on top of the
three existing). Tracer 5 expanded: glossary + agent rule trigger
table gain three new rows.

Plan now passes the principle audit end-to-end.