docs(plan): codemap watch — file-watcher library audit + decision (chokidar v5) by SutuSebastian · Pull Request #46 · stainless-code/codemap

SutuSebastian · 2026-05-02T18:00:25Z

Summary

Plan-only PR for the next backlog item: codemap watch (the highest-impact remaining feature for AI-agent UX — eliminates the "is the index stale?" friction every CLI / MCP / HTTP query rides on today).

Per docs-governance Rule 3, plans live at `docs/plans/.md` and get deleted on ship. This PR adds the plan; CodeRabbit gets a review pass; impl follows in a separate PR once decisions are settled.

What's in the plan

Agent-experience win matrix — today vs with watch mode (today every query past the first edit returns stale rows; with watch mode the index streams alongside the session).
Library evaluation — 6 watchers audited (chokidar, @parcel/watcher, nsfw, watchpack, turbowatch, bare `node:fs.watch`) on speed, robustness, OS coverage (macOS / Linux / Windows / WSL), JS-runtime coverage (Bun + Node), and install footprint.
Decision: chokidar v5. Wins on cross-runtime parity (pure JS — no Bun N-API quirks), 14-year bug-hunt history (~30M repos), smallest install footprint of the JS abstractions, atomic-write + chunked-write handling out of the box, ESM-only Node ≥20.19 alignment with our stack. `@parcel/watcher` kept as a defensible alternative if we ever measure chokidar perf as a bottleneck on >100k-file monorepos.
Sketched CLI surface — `codemap watch` standalone + `codemap serve --watch` / `codemap mcp --watch` killer combos + `CODEMAP_WATCH=1` env shortcut.
5-tracer build plan with acceptance criteria.
Out of scope — polling fallback (defer to consumer ask), daemon detach (rejected by non-goals), SSE push, multi-host.

Behavior change

None — docs only. Implementation lands in a separate PR after this plan is reviewed + merged.

Test plan

`bun run check` green.
CodeRabbit review pass.
CI green.

Why ship the plan separately

Per the existing pattern (PR #43 SARIF, PR #44 serve), plans get a review pass before code lands. Three reasons:

CodeRabbit catches "I considered X but didn't" gaps in the library audit.
The chokidar-vs-@parcel/watcher tradeoff is reversible at impl time but not free — getting it wrong costs a refactor PR.
The CLI surface (`--watch` flag on serve / mcp + standalone verb) is a public-API decision worth a settle pass before tracers begin.

Summary by CodeRabbit

Documentation
- Added documentation plan for upcoming watch-mode feature, which will enable real-time indexing of code changes to keep database current during long-lived sessions, reducing stale data reads.

…okidar v5)

6 watchers audited (chokidar, @parcel/watcher, nsfw, watchpack, turbowatch, node:fs.watch) on speed, robustness, OS coverage (macOS/Linux/Windows/WSL), JS runtime coverage (Bun + Node), maintenance, and install footprint.

Decision: chokidar v5. Wins on cross-runtime parity (pure JS — no N-API quirks on Bun), 14-year bug-hunt history (~30M repos using it), smallest install footprint of the JS abstractions (82 KB + 1 dep), atomic-write + chunked-write handling out of the box, and ESM-only Node ≥20.19 alignment with codemap's stack. @parcel/watcher kept as a defensible alternative if we ever measure chokidar perf as a bottleneck on >100k-file monorepos.

Plan also captures: agent-experience win matrix (today vs with watch mode — eliminates the 'is the index stale?' friction that every CLI/MCP/HTTP query rides on); sketched CLI surface (codemap watch standalone + codemap serve --watch / codemap mcp --watch killer combos + CODEMAP_WATCH=1 env shortcut); 5-tracer build plan with acceptance criteria; explicit out-of-scope (polling fallback, daemon detach, SSE push, multi-host).

Per docs-governance Rule 3: plan lives in docs/plans/watch-mode.md, deleted on ship.

changeset-bot · 2026-05-02T18:00:28Z

⚠️ No Changeset found

Latest commit: 0005ea8

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

coderabbitai · 2026-05-02T18:00:32Z

Warning

Rate limit exceeded

@SutuSebastian has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 55 minutes and 32 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0c074098-8905-403b-ad34-786487ee1978

📥 Commits

Reviewing files that changed from the base of the PR and between 813099a and 0005ea8.

📒 Files selected for processing (1)

docs/plans/watch-mode.md

📝 Walkthrough

A design document for codemap watch is added, defining watch-mode behavior to maintain .codemap.db currency after file edits. The plan specifies library selection (chokidar v5), CLI/API surface, environment variables, watched paths, event-to-reindex flow, implementation seams, and acceptance criteria.

Changes

Watch-Mode Planning Document

| Layer / File(s) | Summary |
|---|---|
| Design Plan `docs/plans/watch-mode.md` | Complete design specification for watch-mode feature, including goals, agent-experience improvements, library evaluation matrix, API/CLI surface (`codemap watch`, `serve --watch`, `mcp --watch`), environment variables (`CODEMAP_WATCH`, debounce tuning), watched paths/globs, event handling pipeline (debounce/coalesce/filter/reindex), implementation seams, tracer plan with acceptance criteria, and out-of-scope items (polling fallback, daemonization, cross-host watching, push notifications). |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

Poem

🐰 A watcher awakens, no stale reads to mourn,
Chokidar brings files—refreshed each morn!
Watch-mode's designed with debounce and care,
The .codemap.db stays crisp and fair. ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: a documentation plan for codemap watch mode that includes a file-watcher library audit and the decision to use chokidar v5. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

SutuSebastian · 2026-05-02T18:04:21Z

@coderabbitai review

coderabbitai · 2026-05-02T18:04:25Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/plans/watch-mode.md`:
- Around line 94-100: The fenced code block showing the CLI examples (lines containing "codemap watch", "codemap serve --watch", and "codemap mcp --watch") is missing a language specifier; update the opening fence to include a language (e.g., bash) so syntax highlighting and linters pick it up, ensuring the rest of the block content remains unchanged.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b285a8ad-dfbb-4c13-b36b-c992529ee51d

📥 Commits

Reviewing files that changed from the base of the PR and between 35fd17b and 813099a.

📒 Files selected for processing (1)

docs/plans/watch-mode.md

…040) Single CodeRabbit comment on PR #46 — markdownlint MD040 caught a bare fence on the CLI surface example. Quick win.

…combo (impl of PR #46 plan) (#47)

* feat(watch): application/watcher.ts skeleton — pure debouncer + path filter + chokidar backend (Tracer 1 of 5)

Per docs/plans/watch-mode.md the engine is split into pure helpers + an injectable backend so tests don't need real fs watches:

- shouldIndexPath(relPath, excludeDirNames) — pure predicate. Same indexed extensions as the indexer (.ts/.tsx/.mts/.cts/.js/.jsx/.mjs/.cjs/.css) plus project-local recipes (<root>/.codemap/recipes/<id>.{sql,md}). Hand-rolled path-segment scan (no .split('/') alloc per call — watcher fires on every unrelated edit).
- createDebouncer(onFlush, delayMs) — sliding-window timer; trigger() resets, flushNow() forces, reset() clears without firing. Coalesces a burst (git checkout, npm install) into one onChange call.
- WatchBackend interface — production wires chokidar v5 (atomic + awaitWriteFinish for chunked-write detection); tests inject a fake backend that drives events deterministically (no flake in CI containers).
- runWatchLoop({root, excludeDirNames, onChange, debounceMs?, backend?}) — wires the three together. Returns {stop} that drains the debouncer + closes the backend (so SIGINT loses no events).
- DEFAULT_DEBOUNCE_MS = 250 (long enough to coalesce one editor save burst, short enough that agents don't perceive lag).

18 unit tests cover: extension whitelist, exclude-dir-names exact match (not substring — 'distill' is fine even when 'dist' is excluded), recipe paths, .codemap/codemap.db rejected, debouncer sliding window + flushNow + reset, backend dispatch, abs→rel POSIX conversion, dedup within burst, stop flushes pending.

chokidar v5 added (1 dep, 82 KB) per docs/plans/watch-mode.md decision. Tracer 2 wires cmd-watch.ts; Tracer 3 wires --watch into serve / mcp.

* feat(watch): cmd-watch.ts CLI verb + main/bootstrap wiring (Tracer 2 of 5)

Adds the standalone 'codemap watch' command. Wires the engine from Tracer 1:

- parseWatchRest() with --debounce <ms> + --quiet flag (12 unit tests cover space/equals forms, validation, defaults, composition, error paths).
- printWatchCmdHelp() — explains the --debounce trade-off (lower = snappier, higher = fewer cycles during git checkout / npm install) and points at 'codemap serve --watch' / 'codemap mcp --watch' as the killer combo (Tracer 3).
- runWatchCmd() — bootstraps codemap (initCodemap + configureResolver), starts runWatchLoop with onChange = runCodemapIndex({mode: 'files', files: [...paths]}), awaits SIGINT/SIGTERM, drains pending edits before close.
- main.ts: dispatch on rest[0] === 'watch'; bootstrap.ts: validateIndexModeArgs accepts 'watch'; printCliUsage lists 'Watch mode' between HTTP server and Targeted reads.

Per-batch stderr line: 'codemap watch: reindex N file(s) in Mms' unless --quiet. Smoke verified: 'bun src/index.ts watch' boots, logs the bind line, drains cleanly on SIGTERM. Tracer 3 wires --watch into serve / mcp.

* feat(watch): --watch flag on serve + mcp + shared createReindexOnChange helper (Tracer 3 of 5)

Killer combo: codemap mcp --watch / codemap serve --watch boots the transport AND a co-process file watcher in one process. Removes the 'is the index stale?' friction agents hit today (per docs/plans/watch-mode.md § Agent-experience win).

Factored helper to keep cmd-watch / mcp-server / http-server identical:

- application/watcher.ts: createReindexOnChange({quiet, label?}) — opens DB, runs targeted reindex on the changed paths, logs 'reindex N file(s) in Mms' to stderr unless quiet, catches errors so a transient parse failure doesn't kill the loop. Caller passes a label so 'codemap mcp' / 'codemap serve' / 'codemap watch' lines are distinguishable in interleaved logs.
- cmd-watch.ts now uses createReindexOnChange (DRY with the embedders).

CLI surface:

- cmd-mcp.ts: new --watch flag + --debounce <ms> override (default 250). Help text + parser tests + propagation through runMcpCmd → runMcpServer.
- cmd-serve.ts: same flags. Parser tests + 'serve: ... (watch: on)' marker on the bind line.
- CODEMAP_WATCH=1 / 'true' env shortcut for IDE / CI launches that can't easily edit the agent host's tool spawn command (per the plan's sketched API).

Embedder lifecycle:

- runMcpServer: starts watcher AFTER server.connect(transport); on shutdown awaits stopWatch() (drains pending reindex) before resolving.
- runHttpServer: starts watcher AFTER listen succeeds; on SIGINT/SIGTERM awaits stopWatch() then closes the listener.

146 tests pass (cmd-watch + cmd-mcp + cmd-serve + watcher + mcp-server + http-server). No new code paths in the existing engines — just the boot-time wiring.

* feat(watch): handleAudit skips incremental-index prelude when watcher is active (Tracer 4 of 5)

Closes the wasted-I/O loop the plan called out: today MCP audit's default behavior is to run an incremental-index prelude (so 'head' reflects the on-disk source) — but with mcp --watch / serve --watch the watcher already keeps the index fresh, so the prelude is pure overhead.

- application/watcher.ts: module-level watchActive flag toggled by runWatchLoop start/stop. isWatchActive() exposed for handleAudit; _resetWatchStateForTests + _markWatchActiveForTests for test seam.
- application/tool-handlers.ts handleAudit: shouldRunPrelude = !args.no_index && !(isWatchActive() && args.no_index !== false). Hoisted to function scope so the inner finally can also pass it as the readonly hint to closeDb (avoids a wasted checkpoint pass).
- Explicit no_index: false still forces the prelude even when watch is on (escape hatch for 'force re-index right now').
- 1 new watcher test + 1 new MCP-server integration test (audit succeeds with no_index unset when watcher is marked active — would have failed if the prelude tried to run on the test's freshly-created DB without git history).

62 watcher + MCP tests pass.
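
The Tracer 4 gating rule can be read as a small pure function. This is a minimal sketch: the boolean expression is copied from the commit message, while the `AuditArgs` shape and passing the watcher state in as a parameter (rather than calling `isWatchActive()`) are assumptions made to keep the example self-contained.

```typescript
// Hypothetical argument shape; only the `no_index` field comes from the commit message.
interface AuditArgs {
  no_index?: boolean;
}

// Decide whether the audit should run the incremental-index prelude.
// `watchActive` stands in for the module-level isWatchActive() check.
function shouldRunPrelude(args: AuditArgs, watchActive: boolean): boolean {
  return !args.no_index && !(watchActive && args.no_index !== false);
}

// The four cases that matter:
//   no watcher, no_index unset      -> prelude runs (the pre-watch default)
//   watcher on, no_index unset      -> prelude skipped (watcher keeps the index fresh)
//   watcher on, no_index === false  -> prelude forced (the escape hatch)
//   no_index === true               -> prelude never runs
```

Keeping `undefined` and `false` distinct is what makes the escape hatch possible: an unset `no_index` defers to the watcher, an explicit `false` overrides it.
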
* docs: sync README + architecture + glossary + roadmap + agents (Rule 10) + delete plan + changeset (Tracer 5 of 5)

- README.md 'Daily commands' stripe: extended with codemap mcp --watch / serve --watch / watch standalone / CODEMAP_WATCH=1 examples.
- docs/architecture.md: new 'Watch wiring' paragraph after MCP / HTTP wiring; covers chokidar selection, debounce + filter, audit prelude optimization. application/ table extended with watcher.ts.
- docs/glossary.md: new 'codemap watch / watch mode' entry under ## C (alphabetically before 'codemap mcp' / 'codemap serve' since 'watch' < 'serve' but the entry naming convention puts it after the existing CLI verbs).
- docs/roadmap.md: 'Watch mode for dev' line removed (shipped per Rule 2).
- .agents/rules/codemap.md + templates/agents/rules/codemap.md (Rule 10): new 'Watch mode (live reindex)' table row + --watch / --debounce flags appended to the mcp + serve rows.
- .agents/skills/codemap/SKILL.md + templates/agents/skills/codemap/SKILL.md: --watch / --debounce + CODEMAP_WATCH semantics on the MCP + HTTP server bullets; new 'Watch mode' bullet covering standalone vs combined shape choice and the audit prelude optimization.
- .changeset/codemap-watch.md: minor changeset (new top-level CLI verb + new --watch flag on mcp + serve).
- docs/plans/watch-mode.md: deleted on ship per docs-governance Rule 3.
- src/{application/{watcher,mcp-server,http-server}.ts, cli/cmd-{mcp,watch}.ts}: replaced dangling cross-refs to the deleted plan with cross-refs to architecture.md § Watch wiring.

* fix(watch): drain in-flight + prime gating + onError clears flag + 6 robustness fixes (CodeRabbit on #47)

9 of 10 CodeRabbit threads, all verified ✅ correct. The 10th (#4 — anchor) is ⚠️ partial: their suggested anchor (#watch-wiring) doesn't exist; #cli-usage is correct (precedent for every wiring paragraph). Pushing back with evidence in the reply.

**Major correctness fixes:**

- (#5, heavy) handleAudit treated 'watch active' as 'index definitely fresh' — but the watcher only sees NEW events, not historical drift. On boot before catch-up, audit could read a months-stale index. New onPrime opt on runWatchLoop runs an incremental catch-up BEFORE flipping watchActive=true. Embedders pass createPrimeIndex({label}) — same pattern as createReindexOnChange. Without onPrime, flag flips immediately (test-friendly default).
- (#7, heavy) stop() didn't drain async reindex work — fire-and-forget meant a stop() could resolve while onChange was mid-DB-write. Now: serialize onChange via inFlight chain, await it on stop. Also await primingDone so we don't tear down a DB connection out from under the prime catch-up.
- (#8) Backend onError left watchActive=true → handleAudit kept skipping prelude even when chokidar died. Now clears the flag.
- (#1) http-server: watcher boot throw after listen() leaked the listener. try/catch closes server on failure.
- (#2 + #10) http-server + cmd-watch: stopWatch().then(closeServer) never fired closeServer if stopWatch rejected — process hang on SIGTERM. Now .catch(log).finally(...) so progress is guaranteed.
- (#3) mcp-server.test: _markWatchActiveForTests ran outside the try guard — a thrown makeClient() would leak the singleton flag into sibling tests. Hoisted into try/finally.

**Minor:**

- (#6) shouldIndexPath built recipe prefix with platform sep, but relPath is POSIX-normalized. Windows recipe edits got skipped. Fixed to literal '.codemap/recipes/'.
- (#9) printWatchCmdHelp lacked JSDoc; added.

**Push-back:**

- (#4) CodeRabbit suggested changing #cli-usage anchor to #watch-wiring. Verified: there's no '## Watch wiring' heading — the wiring sections are bold-prefix paragraphs under '## CLI usage'. Their fix would 404. Keeping #cli-usage matches the precedent for MCP / HTTP / SARIF / serve wiring paragraphs.

5 new watcher tests cover the prime-gating race, onError flag clear, in-flight drain on stop (both auto-fire and flushNow paths). 152 tests pass total.

* docs(watch): outside-diff + 2 nitpicks from CodeRabbit follow-up review on #47

All 3 verified ✅ correct (the inline 10 actionable were already addressed in 207c05d):

- (outside-diff, mcp-server.ts:165) audit tool description still claimed prelude always runs first; updated to document watch-mode default ('default true-equivalent without watch, default false-equivalent with --watch active') and the 'pass no_index: false to force a re-index even when watch is active' escape hatch.
- (nitpick, glossary.md:395) widened wording from 'codemap mcp audit' to 'audit tool ... on both transports' since the same skip applies to 'codemap serve --watch' POST /tool/audit.
- (nitpick, cmd-serve.test.ts:14 + cmd-mcp.test.ts:11) replaced hardcoded debounceMs: 250 with imported DEFAULT_DEBOUNCE_MS so future default changes don't silently break tests.
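
The two pure helpers named in Tracer 1, with the recipe-prefix fix from thread #6 applied, might look roughly like the sketch below. Only the function names, the extension list, the literal `.codemap/recipes/` prefix, the 250 ms default and the trigger/flushNow/reset behaviour come from the commit messages. Everything else is an assumption: the `ReadonlySet` parameter type, accepting any `.sql`/`.md` file under the recipe prefix, and `flushNow()` being a no-op when nothing is pending. The sketch also uses `slice` per segment, so it avoids only the array allocation of `split('/')`, not every allocation.

```typescript
const DEFAULT_DEBOUNCE_MS = 250;
const INDEXED_EXTENSIONS = [".ts", ".tsx", ".mts", ".cts", ".js", ".jsx", ".mjs", ".cjs", ".css"];
// Literal POSIX prefix: relPath is POSIX-normalized, so no platform separator here.
const RECIPE_PREFIX = ".codemap/recipes/";

// Pure predicate: should an edit to this root-relative POSIX path trigger a reindex?
function shouldIndexPath(relPath: string, excludeDirNames: ReadonlySet<string>): boolean {
  if (relPath.startsWith(RECIPE_PREFIX)) {
    return relPath.endsWith(".sql") || relPath.endsWith(".md");
  }
  // Compare each directory segment exactly, so "distill" passes when "dist" is excluded.
  let start = 0;
  for (let i = 0; i < relPath.length; i++) {
    if (relPath.charCodeAt(i) === 47 /* "/" */) {
      if (excludeDirNames.has(relPath.slice(start, i))) return false;
      start = i + 1;
    }
  }
  return INDEXED_EXTENSIONS.some((ext) => relPath.endsWith(ext));
}

interface Debouncer {
  trigger(): void;  // (re)start the window
  flushNow(): void; // fire immediately if a window is open
  reset(): void;    // drop the window without firing
}

// Sliding-window debouncer: every trigger() restarts the timer, so a burst
// of events collapses into a single onFlush call.
function createDebouncer(onFlush: () => void, delayMs: number = DEFAULT_DEBOUNCE_MS): Debouncer {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const clear = (): void => {
    if (timer !== undefined) {
      clearTimeout(timer);
      timer = undefined;
    }
  };
  return {
    trigger(): void {
      clear();
      timer = setTimeout(() => {
        timer = undefined;
        onFlush();
      }, delayMs);
    },
    flushNow(): void {
      if (timer === undefined) return;
      clear();
      onFlush();
    },
    reset(): void {
      clear();
    },
  };
}
```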

…pshot date (#48)

- fallow.md status snapshot date 2026-05-02 → 2026-05-03; new 'codemap watch' bullet under 'Adjacent shipped' covering the chokidar selection, three shapes, prime-gating + in-flight drain robustness fixes (CodeRabbit caught two heavy bugs on PR #47).
- competitive-scan-2026-04.md: 'Watch mode (codemap watch) — still backlog' line replaced with shipped row pointing at PR #47 (impl) + PR #46 (plan).
- Anchor cross-ref bumped to match the new fallow.md heading.

…(v1.x backlog) (#51)

* docs(plan): codemap audit --base <ref> — worktree + reindex strategy (v1.x backlog)

Plan for the next-best agent-value loop: PR-review structural-diff. Replaces today's 3-step --baseline dance with one verb. Reuses 90% of the existing audit infrastructure (PR #33); only new piece is the worktree+reindex snapshot path. Cache-by-resolved-sha; LRU 5/500 MiB; mutual-exclusive with --baseline; per-delta override compatible. Hard error on non-git projects (no graceful fallback — there's no meaningful 'ref' without git).

Plan only — implementation follows after CodeRabbit review per the impact (#49→#50) / watch (#46→#47) workflow.

* docs(plan): address CodeRabbit findings — atomic cache populate (D11), worktree-as-cache lifecycle clarity, TS type widening callout

- D1/D2/D8 rewritten: worktree IS the cache entry (kept until LRU evicts); cleanup runs only on reindex failure rollback OR LRU eviction. The earlier ambiguity (D2 said 'cache by sha' while D8 said 'remove in finally') is resolved.
- D11 added: atomic cache populate via per-pid temp dir + POSIX rename → free single-flight semantics; no lock files needed. Same pattern for eviction. Closes the race CodeRabbit flagged on concurrent CI matrix runs against the same sha.
- AuditBase TS type widening to discriminated union called out explicitly above the Decisions table (Tracer 1 ships it).
- CODEMAP_AUDIT_CACHE_SIZE env var mention dropped — was promising an unimplemented config knob; v1 hardcodes the limits, defer to v1.x+.
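
The D11 pattern (build into a per-pid temp directory, then rename into place) can be sketched generically with Node's `fs` module. This is an illustration of the technique only, not the plan's implementation: `populateCacheEntry`, the `.tmp-<sha>-<pid>` naming, the `build` callback and the `index.db` file are all hypothetical, and the real entry is a git worktree, which this sketch does not model.

```typescript
import { existsSync, mkdirSync, mkdtempSync, renameSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Populate <cacheRoot>/<sha> at most once, even if several processes race.
// Each process builds in its own temp dir; the rename decides the winner.
function populateCacheEntry(cacheRoot: string, sha: string, build: (dir: string) => void): string {
  const finalDir = join(cacheRoot, sha);
  if (existsSync(finalDir)) return finalDir; // cache hit, nothing to build

  mkdirSync(cacheRoot, { recursive: true });
  const tmpDir = join(cacheRoot, `.tmp-${sha}-${process.pid}`);
  rmSync(tmpDir, { recursive: true, force: true }); // clear leftovers from a crashed run
  mkdirSync(tmpDir);

  try {
    build(tmpDir);
  } catch (err) {
    rmSync(tmpDir, { recursive: true, force: true }); // roll back a failed build
    throw err;
  }

  try {
    renameSync(tmpDir, finalDir); // atomic publish on the same filesystem
  } catch (err) {
    // Another process published first: discard our copy and use theirs.
    rmSync(tmpDir, { recursive: true, force: true });
    if (!existsSync(finalDir)) throw err;
  }
  return finalDir;
}

// Usage with a throwaway cache root and a made-up sha.
const cacheRoot = mkdtempSync(join(tmpdir(), "audit-cache-"));
const entry = populateCacheEntry(cacheRoot, "abc123", (dir) => {
  writeFileSync(join(dir, "index.db"), "");
});
```

Readers never observe a half-built entry because the final path only appears through the rename, which is why no lock file is needed.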

…age` table) (#56)

* docs(plan): static coverage ingestion (Istanbul JSON → `coverage` table)

Plans the C.11 candidate from `research/fallow.md` — `codemap ingest-coverage <path>` reads Istanbul `coverage-final.json` into two new tables (`coverage` symbol-level + `file_coverage` rollup), joinable to `symbols` for the killer "what's structurally dead AND untested?" recipe in one query.

Resolves the open question from `fallow.md § 6` ("symbols column vs separate table?") in favour of a separate table with `ON DELETE CASCADE` (D1) — coverage shape evolves independently of structural columns; LEFT JOIN keeps NULL semantics explicit; rows survive `--full` reindex via the `query_baselines` precedent (D6).

Key decisions:

- Istanbul JSON in v1; LCOV in v1.x; raw V8 traces never (D3, fallow's paid moat).
- One-shot `ingest-coverage` verb decoupled from `codemap` index runs (D4) — coverage cadence (per `bun test --coverage`) ≠ index cadence (per file edit).
- Statement coverage only in v1 (D5); branch/function deferred until a consumer asks.
- MCP/HTTP exposure as a query column, not a separate `coverage` tool (D9) — composes with every existing recipe + ad-hoc SQL.
- `codemap audit --delta coverage` deferred to v1.x (D10) — raw schema first.

Five-tracer plan: schema bump → engine → CLI verb → fixture + golden recipe → docs.

Plan only — implementation follows after CodeRabbit review per the established workflow (PRs #46/47, #49/50, #51/52, #53/54).

* docs(plan): fact-check fixes — drop hallucinated SQL/projection/runner claims

Self-audit against the actual codebase surfaced four claims that didn't hold:

1. Killer recipe SQL referenced `callee_id` — `calls` is name-keyed (`callee_name TEXT`, no symbol-id FK; see `db.ts` `CallRow`). Rewrote the "no callers" predicate as `NOT EXISTS (… WHERE callee_name = s.name)`.
2. D7 claimed line-range projection is "the same `markers` already uses" — `markers` is line-pinned (`line_number INTEGER`), no projection. Reworded as "novel for this plan" with the actual mechanic spelled out.
3. D3 listed `bun test --coverage` as an Istanbul JSON emitter — `bun test --help` shows only `text` / `lcov` reporters today. Removed bun from the Istanbul-emitters list; left vitest/jest/c8/nyc with the explicit reporter flags they need.
4. D12 contradicted D6 ("rows absent until re-ingest" vs "rows survive `--full`"). Reconciled: empty is the correct initial state on first bump; subsequent bumps preserve via the `dropAll()` exclusion. Quoted the `lessons.md` policy verbatim instead of paraphrasing.

* docs(plan): v2 — fix CASCADE hazard + innermost-wins projection + nits

Self-grilling found two real schema design holes that would block execution:

1. **D6 CASCADE hazard.** Original draft keyed `coverage` on `symbol_id REFERENCES symbols(id) ON DELETE CASCADE`. Every `--full` reindex calls `dropAll()` → drops `symbols` → CASCADE wipes coverage, regardless of whether `coverage` itself was excluded from `dropAll()`. Recreated `symbols` get fresh auto-increment IDs anyway → coverage permanently lost without re-ingest. Fix: natural-key PK `(file_path, name, line_start)` — no FK to `symbols.id`. Survives the `symbols` drop-recreate cycle. Trade-off: orphan rows when files are deleted; cleaned by one explicit `DELETE FROM coverage WHERE file_path NOT IN (SELECT path FROM files)` after every ingest.
2. **D7 overlapping symbols.** Original draft: `line_start ≤ stmt_line ≤ line_end` matches every enclosing scope. With nested symbols (class methods inside classes, closures inside functions), one Istanbul statement projects onto 3+ symbols, inflating `total_statements` 2-3×. Fix: innermost-wins via `(line_end - line_start) ASC LIMIT 1`. New `skipped.statements_no_symbol` counter for statements that fall outside every symbol range (top-level expressions, side-effect imports).
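
The innermost-wins rule from D7 is stated in SQL terms (`(line_end - line_start) ASC LIMIT 1`); the same selection can be sketched in TypeScript to show why it stops the double counting. The `SymbolRange` type and both function names are hypothetical, and ties between equally sized ranges are resolved by input order here, which the plan does not specify.

```typescript
// Hypothetical in-memory view of a symbol row for a single file.
interface SymbolRange {
  name: string;
  lineStart: number;
  lineEnd: number;
}

// Pick the smallest symbol range that encloses the statement line,
// or undefined when no symbol contains it.
function innermostSymbol(stmtLine: number, symbols: readonly SymbolRange[]): SymbolRange | undefined {
  let best: SymbolRange | undefined;
  for (const s of symbols) {
    if (stmtLine < s.lineStart || stmtLine > s.lineEnd) continue;
    if (best === undefined || s.lineEnd - s.lineStart < best.lineEnd - best.lineStart) {
      best = s;
    }
  }
  return best;
}

// Project statement lines onto symbols: each statement counts once, for its
// innermost symbol; statements outside every range are tallied separately.
function projectStatements(stmtLines: readonly number[], symbols: readonly SymbolRange[]) {
  const totals = new Map<string, number>();
  let statementsNoSymbol = 0;
  for (const line of stmtLines) {
    const owner = innermostSymbol(line, symbols);
    if (owner === undefined) {
      statementsNoSymbol++;
    } else {
      totals.set(owner.name, (totals.get(owner.name) ?? 0) + 1);
    }
  }
  return { totals, statementsNoSymbol };
}
```

With a class spanning lines 1-50, a method at 10-20 and a closure at 12-14, a statement on line 13 is attributed to the closure only, instead of to all three enclosing ranges.
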
Nits cleared in the same pass:

- D2: drop `file_coverage` rollup table from v1 (aggregateable via GROUP BY on the symbol-level table; doubling sources of truth without a benchmark is premature). Promote to v1.x with a real query.
- D11: spec the `total_statements = 0 → coverage_pct IS NULL` edge case + document the cross-file name-collision lossiness in the killer recipe.
- Drop `--prune` flag (orphan cleanup is unconditional, no flag needed).
- Drop per-row `source` column (single meta key sufficient; one ingest at a time).
- Update killer recipe SQL to use the natural-key 3-column join.
- Drop made-up "~50 LoC LCOV ingester" estimate and "<50 ms / <1 ms / ~500 KB" performance numbers (no benchmark backed them).
- Tracer 1 / 2 / 3 acceptance criteria updated to match the new schema.

Plan is now ready for tracer-1 implementation. CodeRabbit pass deferred (rate-limited 57m).

* docs(plan): tighten Bun-native API references (file read + perf note)

Plan correctly inherits the established Node vs Bun runtime split, but the single tracer-3 reference understated it. Now:

- Tracer 3 cites `packaging.md § Node vs Bun` as the canonical pattern source instead of pointing at config.ts in passing.
- Performance section calls out the actual lever — `Bun.file(path).json()` uses Bun's native JSON parser, materially faster than V8 `JSON.parse` on multi-MB Istanbul payloads (real coverage files for medium codebases routinely hit several MB).

No new Bun-native API surfaces are added — the feature doesn't need globbing, file writes, spawn, or hashing beyond what the existing engines already use through their abstractions.

* docs(plan): v3 — ship LCOV in v1 + drop --source flag + bundle killer recipe

The "fully capable, no half-way APIs" principle reshapes three things:

1. **LCOV ingester ships in v1** alongside Istanbul. Original draft deferred LCOV to v1.x, which would exclude `bun test --coverage` users — i.e. codemap's own primary runtime. That's the textbook half-baked surface the principle bans. Two parser front-ends share one `upsertCoverageRows` core; LCOV is regex tokenizing over `SF:` / `DA:` / `end_of_record`. Tracer 2 splits into 2a (shared core + Istanbul parser) and 2b (LCOV parser), both writing identical normalised CoverageRow[] into the same upsert path.
2. **`--source istanbul|lcov` flag dropped.** Auto-detection from extension (`.json` → istanbul, `.info` → lcov, directory → probe both, error on ambiguous) is unambiguous; a flag for "tell codemap what it can already see" is API noise. Misnamed files can be renamed (one-liner) cheaper than codemap can grow a flag.
3. **Killer recipe ships as bundled `untested-and-dead.{sql,md}`** in `templates/recipes/`. Per the recipes-as-content registry (PR #37), the high-value queries become first-class agent surface. A buried doc snippet would be invisible to agents at session start; the bundled recipe shows up in `--recipes-json` and gets a `codemap query --recipe untested-and-dead` direct invocation.

Tracer 4 also fans out: Istanbul + LCOV fixtures cover the same partial coverage shape; three golden recipes (`coverage-istanbul.json`, `coverage-lcov.json`, `untested-and-dead.json`) prove format equivalence.

Out-of-scope, alternatives, performance section, title, and goal statement all updated to match.

* docs(plan): v4 — agent-journey audit + bundled recipe shelf (D13)

Walked every D / OOS / tracer item against "fully capable + agent first-class + no half-baked APIs". Found three half-baked surfaces:

1. **D2 deferral leaks "compose GROUP BY yourself" onto the agent.** Deferring the `file_coverage` table is correct (no benchmark proves it's needed) — but the agent-facing answer for "rank files by coverage" was missing. Fix: keep table deferral, ship a bundled `files-by-coverage.{sql,md}` recipe so the GROUP BY view IS first-class.
2. **D11 name-collision lossiness was acknowledged but unmitigated.** The killer recipe's `callee_name = s.name` cross-file lossiness was documented in the recipe SQL comment, but the recipe `.md` didn't give the agent any narrowing pattern. Now D11 ships three concrete narrowing patterns in the `.md` (file_path scope, default-export filter, exported-only restriction) so the agent has workable mitigations on day one.
3. **Missing recipe shelf for common agent questions.** Walking the journey: only "What's structurally dead AND untested?" had a recipe; "Rank files by coverage" and "Worst-covered exported symbols" forced ad-hoc SQL. Three recipes fully cover the agent journey end-to-end.

New D13 codifies the bundled-recipe principle: every common agent question gets a `--recipe` verb. Three v1 recipes:

- `untested-and-dead.{sql,md}` (killer, with name-collision mitigations)
- `files-by-coverage.{sql,md}` (replaces D2's table deferral)
- `worst-covered-exports.{sql,md}` (top-N agent ask)

Each `.md` carries a frontmatter `actions` block (per PR #26) so agents get per-row follow-up hints. All three appear in `--recipes-json` automatically — agents discover them at session start.

New "Agent journey" section makes the principle visible: a table mapping every common agent question to the v1 verb that answers it. If a row ever shows "compose SQL yourself" without a recipe, the surface is half-baked and needs a recipe before tracer 1 ships.

Tracer 4 expanded: ships all three recipes + five golden snapshots (adds files-by-coverage.json + worst-covered-exports.json on top of the three existing). Tracer 5 expanded: glossary + agent rule trigger table gain three new rows.

Plan now passes the principle audit end-to-end.
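
The v3 notes describe the LCOV front-end as tokenizing over `SF:` / `DA:` / `end_of_record` and picking the parser from the file extension. A minimal sketch of both steps follows. `detectFormat`, `parseLcov` and the `LineHit` shape are hypothetical names, the directory-probing branch is left out, and only the `DA:<line>,<hits>` fields are read (any trailing checksum is ignored).

```typescript
type CoverageFormat = "istanbul" | "lcov";

// Extension-based auto-detection for a single file path.
function detectFormat(path: string): CoverageFormat {
  if (path.endsWith(".json")) return "istanbul";
  if (path.endsWith(".info")) return "lcov";
  throw new Error(`cannot infer coverage format from "${path}"`);
}

// One normalised row per executable line in the LCOV tracefile.
interface LineHit {
  filePath: string;
  line: number;
  hits: number;
}

// Walk the tracefile line by line: SF: opens a record, DA: adds a row,
// end_of_record closes it. Other record types (FN:, BRDA:, LF:, ...) are skipped.
function parseLcov(text: string): LineHit[] {
  const rows: LineHit[] = [];
  let currentFile: string | undefined;
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line.startsWith("SF:")) {
      currentFile = line.slice(3);
    } else if (line === "end_of_record") {
      currentFile = undefined;
    } else if (currentFile !== undefined) {
      const m = /^DA:(\d+),(\d+)/.exec(line);
      if (m) rows.push({ filePath: currentFile, line: Number(m[1]), hits: Number(m[2]) });
    }
  }
  return rows;
}
```

Rows in this shape could then feed the same upsert path as the Istanbul parser, which is the format-equivalence property the golden snapshots are meant to prove.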

…plugin scope)

Grill-me Q6 outcome (and accounting cleanup): three of five § 6 open questions are now resolved by prior grill outcomes — § 6 needs to reflect that, not pretend they're still open.

Resolutions captured:

- Q1 (daemon-default for mcp/serve) — RESOLVED THIS GRILL TURN. Default --watch ON for both modes; opt-out via --no-watch / CODEMAP_WATCH=0. One-shot CLI defaults preserved (no watcher on query/show/snippet). Receipts: stale-index = #1 agent UX complaint (fallow.md § 6); chokidar lazy startup validated tiny by PR #46 6-watcher audit. Flip is a small follow-up PR (flag default + test + patch changeset + agent rule update per docs/README.md Rule 10). AST-caching measurement parked downstream of the flip.
- Q3 (LSP shim vs standalone) — RESOLVED in § 2.5 reframe earlier this grill (commit 0b9d878). Thin shim wrapping shipped engines; no engine (would duplicate moat B substrate). Standalone deferred to "if VSCode-extension demand emerges."
- Q4 (C.9 plugin contract scope) — RESOLVED via § 5 (b) plan-PR pre-locked decisions (commit 6f845ba). Entry-point hints only for v1; arbitrary edge injection deferred to v2. Static config only per § 3 ergonomic "no JS exec at index time" floor.

§ 6 restructured: "Resolved (2026-05)" subsection at top with full rationale + receipts; "Still open" subsection below with Q2 (FTS5 default) and Q5 (history table) — the only two genuinely-open questions left.

§ 2.4 verdict updated to point at the resolved § 6 Q1 anchor instead of the open-question wording.

Anchor preservation: external links (#6-open-questions) still resolve to the section heading. New internal anchor (#resolved-2026-05) used by § 2.4 verdict — single inbound link, no external citations to break.
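
The Q1 resolution (watch ON by default for mcp / serve, opt-out via `--no-watch` or `CODEMAP_WATCH=0`, no watcher for one-shot verbs) can be sketched as a single resolver. This is a guess at the shape of the follow-up change, not its implementation: `resolveWatch` and the `Mode` union are hypothetical, and letting an explicit flag win over the environment variable is an assumption the commit message does not state.

```typescript
// Hypothetical set of CLI modes; only mcp/serve are long-lived.
type Mode = "mcp" | "serve" | "query" | "show" | "snippet";

// Decide whether to start the co-process watcher for this invocation.
function resolveWatch(
  mode: Mode,
  argv: readonly string[],
  env: Readonly<Record<string, string | undefined>>,
): boolean {
  // One-shot verbs never start a watcher.
  if (mode !== "mcp" && mode !== "serve") return false;
  // Assumption: explicit flags take precedence over the environment.
  if (argv.includes("--no-watch")) return false;
  if (argv.includes("--watch")) return true;
  // CODEMAP_WATCH=0 opts out; anything else keeps the new default.
  if (env["CODEMAP_WATCH"] === "0") return false;
  return true;
}
```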



…quence (2026-05) (#58) * docs(research): non-goals reassessment + fallow clone deep-dive map (2026-05) Companion to research/fallow.md (capability tracker — what to adopt FROM fallow). This new doc inventories what THIS codebase already unlocks that the current Non-goals (v1) list forbids, post-C.11. User observation: many non-goals were defensive choices made when the project was 1/10th its current size, then carried forward unchallenged as the surface grew (15+ recipes, 12+ tables, 3 engines, watch mode, coverage, audit, impact). The reframe: stop asking "what should we not do?" and start asking "what does the SQL-index-with-three-transports actually unlock that no other tool does?" Findings: §1 — 10 first-class agent capabilities sitting in unwritten JOINs / formatters / verbs (components-touching-deprecated, unimported-exports, complexity per symbol, refactor-risk-ranking, boundary violations, unused type members, Mermaid output, MCP file/symbol resources, recipe usage telemetry, rename --dry-run preview). §2 — Five non-goals worth challenging: - "No FTS5 / use ripgrep" — SQLite ships FTS5; ripgrep loses JOIN composition (TODOs inside @deprecated functions in <50% covered files is one query, vs three tools today). - "No visualisation" — conflates rendering pixels with shaping render- ready data; Mermaid / D2 are JSON-shaped formatters (sibling of SARIF). - "No static analysis" — we already ship deprecated-symbols, untested- and-dead, barrel-files, fan-in/out; the line was rhetorical. Real boundary is "no opinionated rule engine, no fix mutation". - "No persistent daemon" — we have one (mcp --watch, serve --watch, watch); non-goal preserves a constraint that no longer exists. - "No LSP replacement" — show + impact + watch is 80% of LSP read-side; ship a thin shim consuming existing engines, don't write an LSP. §3 — Real architectural limits worth keeping (sub-100ms cold-start CLI, no LLM in box, no fix engine, no runtime tracing, no JS exec at index time). 
§4 — Map of /Users/sutusebastian/Developer/OSS/fallow clone deep-dive points: which crates / docs / configs to inspect before each shipped feature so we adopt patterns rather than reinvent. Cite-the-source-path discipline mirrors the existing research/fallow.md cite-the-PR habit. §5 — Recommended sequence: (a) FTS5 + Mermaid one-PR non-goal flip → (c) complexity column → (b) C.9 plugin layer (multi-tracer big surface) → (d) LSP shim. (a) is the cheapest non-goal flip; ships a confidence move before the bigger surfaces. §6 — 5 open questions (daemon-by-default for MCP/HTTP, FTS5 opt-in, LSP shim vs standalone, plugin contract scope, history table shape). Doc-governance compliance: - Goes in docs/research/ per Rule 3 (research-class doc). - Cross-references roadmap, why-codemap, fallow.md, competitive-scan per Rule 5. - Doesn't duplicate non-goals (Rule 1) — proposes amendments to be applied when § 2 items ship, in lockstep with why-codemap per the Single source of truth table. - No inventory counts in narrative (Rule 6) — uses qualitative "15+ recipes / 12+ tables" only. * docs(research): triangulate non-goals reassessment vs descriptive baseline User cross-checked my prescriptive doc (non-goals-reassessment-2026-05.md) against composer-2-fast's descriptive baseline (codemap-capability- surface-2026-05.md) plus the codebase as source of truth. Found three factual errors in mine; baseline doc held up clean. Corrections applied: 1. § 1.2 (Exports never imported): codebase has `exports.re_export_source` column — original doc missed it. Re-exports require a JOIN through that column to avoid false positives on barrel-only exports. Effort bumped XS → S. 2. § 1.3 (Cyclomatic complexity): claimed "AST walker already counts nodes during parse" — false. `rg 'complexity|node_count|nodeCount' src/` returns zero matches. Node-counting is NOT in place; needs an extension to the AST walker in src/parser.ts. Effort bumped S → M. 3. 
§ 2.3 ("no static analysis" non-goal): listed `fan-in` and `fan-out` as "static analysis we already ship" — too loose. Per `fan-in.sql` (`ORDER BY fan_in DESC LIMIT 15`) they're hotspot rankers, not orphan / dead-code detectors. They don't cover the closed-dead- subgraph case from research/fallow.md § 0 (8-file pack with non- zero fan-in via self-import). That gap motivates C.9 framework plugin layer, not the "no static analysis" flip. Caveat now spelled out in the doc. Header updated: this doc is the **prescriptive** lens; the **descriptive baseline** lives in codemap-capability-surface-2026-05.md (read first). Cross-references list and § 8 errata block document the diff between v1 and v2 so future reviewers can see what changed and why. Process lesson encoded in § 8: every prescriptive research note should triangulate against a descriptive baseline (own doc or peer model) before recommending a ship sequence. Caught all three errors before they propagated into a plan PR. * docs(research): scrub local user paths from non-goals doc + new lesson User caught absolute-path leaks in the research note pointing at the fallow clone on the maintainer's machine. Three references replaced with the public upstream URL (https://github.com/fallow-rs/fallow): - Header "Local clone for deep-dives" → "Source for deep-dives" - § 4 heading "What to inspect in the local fallow clone" → "...in the fallow source tree" - § 7 cross-references "Local fallow clone — /Users/..." → "fallow upstream" Also adds a new general-purpose lesson to .agents/lessons.md: Never commit absolute local user paths — no /Users/<name>/…, /home/<name>/…, ~/…, or file:/// URIs in any tracked doc, code, comment, or PR body. Pattern: cite https://github.com/<org>/<repo> for upstream sources; repo-relative paths for in-tree references. Sibling to the existing "PR bodies via temp file" lesson — same family (committed strings need to be portable + non-leaking), different surface. 
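The "never commit absolute local user paths" lesson above could be backed by a small pre-commit check. The following is a minimal sketch, not something this PR ships: `findLocalPathLeaks` and the regexes are hypothetical, derived only from the four patterns the lesson names (`/Users/<name>/…`, `/home/<name>/…`, `~/…`, `file:///`).

```typescript
// Hypothetical helper (not part of codemap): flags absolute local user paths
// in text that is about to be committed. Patterns mirror the lesson's examples.
const LOCAL_PATH_PATTERNS: ReadonlyArray<{ label: string; pattern: RegExp }> = [
  { label: "macOS home", pattern: /\/Users\/[^\/\s]+\// },
  { label: "Linux home", pattern: /\/home\/[^\/\s]+\// },
  { label: "tilde home", pattern: /(^|\s)~\// },
  { label: "file URI", pattern: /file:\/\/\// },
];

interface PathLeak {
  line: number; // 1-based line number in the scanned text
  label: string; // which pattern matched
  text: string; // the offending line, trimmed
}

function findLocalPathLeaks(content: string): PathLeak[] {
  const leaks: PathLeak[] = [];
  content.split("\n").forEach((text, index) => {
    for (const { label, pattern } of LOCAL_PATH_PATTERNS) {
      if (pattern.test(text)) {
        leaks.push({ line: index + 1, label, text: text.trim() });
      }
    }
  });
  return leaks;
}

// Usage: a public upstream URL passes, a maintainer-local clone path does not.
const sample = [
  "Source for deep-dives: https://github.com/fallow-rs/fallow",
  "Local clone: /Users/alice/dev/fallow",
].join("\n");
console.log(findLocalPathLeaks(sample));
```

Regexes without the `g` flag keep `test` stateless, so the same pattern objects can be reused across lines.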
* docs(lessons): add 'never commit local user paths' lesson (PR #58 catch) * docs(research): delete codemap-capability-surface-2026-05.md (existence test) Per docs/README.md existence test, this doc fails 3 of 4 criteria: - ❌ Doesn't document durable policy unavailable elsewhere — every fact reproducible from db.ts / builtin.ts / audit-engine.ts / --recipes-json - ❌ Doesn't track open work — pure snapshot - ❌ No unique historical context git log + architecture.md can't reconstruct - ✅ Cited by another doc (only because non-goals-reassessment cited it) Plus Rule 1 violation (duplicates architecture.md § Schema) and Rule 6 violation (hardcodes "15 recipes" / "9 of 15 ship actions" inventory counts in narrative). The real value the doc delivered was the **triangulation discipline** — catching 3 errors in non-goals-reassessment v1. That discipline is the durable artifact, not the doc. Codified in two places: 1. non-goals-reassessment § 8 errata + process lesson (kept) 2. .agents/lessons.md — new lesson explicitly bans the "dual descriptive + prescriptive doc" pattern as a Rule 1 violation. Right discipline: pin every concrete claim in the prescriptive doc itself, or self-audit against the canonical home before committing. Don't ship a parallel descriptive doc. non-goals-reassessment header + § 7 + § 8 updated to drop the now-deleted companion-doc references and point at canonical sources directly (architecture.md § Schema, db.ts, builtin.ts, audit-engine.ts V1_DELTAS). * docs(research): align § 5 (c) effort with § 1.3 / § 8 (M, not S) CodeRabbit caught § 5 row (c) "Cyclomatic complexity column" listing effort S, while § 1.3 + § 8 errata both list M (the v1→v2 bump after `rg 'complexity|node_count|nodeCount' src/` returned zero — node- counting isn't already in place; the AST walker in src/parser.ts has to be extended). Effort propagation gap from the v2 errata pass. 
§ 5 row (c) updated to M; "Why" cell now spells out the AST-walker dependency inline so future readers don't re-litigate the figure. * docs(research): split § 3 into moat (load-bearing) vs ergonomic limits Grill-me Q1 outcome (under "extract max from SQL-index + equal/surpass fallow" mission): the original § 3 list conflated ergonomic floors (sub-100ms cold-start, no LLM, no JS at index time) with the actual moats. Most of the original entries are floors fallow also follows; they're not differentiators. The two real moats that needed naming as load-bearing limits: A. SQL is the API — every capability is a recipe (saved query) or a primitive recipes can compose. Verdicts are an OUTPUT mode (--format sarif, audit deltas), never a primitive. Reviewer test: "is this verb also expressible as query --recipe <id>?" B. Extracted structure ≥ verdicts — schema breadth (CSS, markers, type_members, calls.caller_scope, components.hooks_used) is what equals/surpasses fallow on agent-facing capability per fallow.md § 5. Reviewer test for any "drop column X" PR: "what recipe (bundled or hypothetical) does this kill?" Both are now load-bearing rows above the ergonomic ones. The original five preferences are kept verbatim but annotated with their relation to the moat (floor / convergent / adjacent / rivalrous / safety). Eroding either A or B is the most likely path from "codemap" to "fallow with extra steps" — § 3 now equips a reviewer to spot it. * docs(research): § 5 ship sequence — parallel plan-PR for (b) at T+0 Grill-me Q2 outcome (under "equal/surpass fallow" mission): the "cheapest non-goal flip first" ordering was a small-team confidence move, but the § 3 moat rewrite already paid that confidence cost. The real risk under the actual mission is the deferral trap — XL items become "next quarter" while every new recipe inherits the noisy substrate (untested-and-dead's Next.js page.tsx false-positive class). Hybrid resolved: - Shipping cadence stays (a) → (c) → (b) impl → (d). 
- (b) plan PR opens at T+0, iterates in parallel during (a)+(c). - Plan opens with ~30% of decisions pre-locked: entry-point hints only per Grill Q4, static config only per § 3 "no JS exec at index time" ergonomic limit. Not a blank-slate plan — structured from day 1. Added a 5-row T-table in § 5 spelling out the parallel tracks. (b)'s "Why" cell now names the deferral trap explicitly; (d)'s "Why" pins its dep on (b) impl (not just (b)). Rationale list updated to flag that the moat rewrite paid the confidence move so (a) doesn't pay it again. Cost-if-abandoned escape hatch: plan PR can close as "Status: Rejected (YYYY-MM-DD)" per docs/README.md Rule 8. Design surface captured either way. * docs(research): § 2 reframed via § 3 moats (taxonomy + verdict cross-refs) Grill-me Q3 outcome: § 2's five flips inherited their shape from "original non-goals worth challenging" — but after § 3 locked in the moats, that shape conflated three different categories: - Moat-extending flips (2.1 FTS5, 2.3 static analysis) — substrate growth inside moat B - Moat-aligned flip (2.2 output formatters) — verdicts as output mode per moat A - Moat-orthogonal transport flips (2.4 daemon, 2.5 LSP shim) — neither moat is touched; flipping just re-exposes existing substrate Anchors preserved (2.1-2.5 stay) — anchor-preservation discipline per docs-governance § 3 / docs/README.md Rule 7. No cascading link updates needed in § 3 / § 4 / § 5 / § 8. Changes per section: - § 2 header — added a reading note naming the three categories and pointing each flip at the moat row it relates to. - § 2.3 — verdict no longer restates "no opinionated rule engine + no fix engine" (now canonical in § 3 moat A + ergonomic row); instead cross-references and names the static-analysis category as in-scope. Closed-dead-subgraph caveat preserved (it's the C.9 motivator). - § 2.4 — added "Moat relation: orthogonal" subsection naming the transport / process-model framing. 
AST-caching capability claim preserved + cross-linked to § 6 Q1. Verdict points the daemon-default question at § 6 Q1 explicitly (single canonical home). - § 2.5 — replaced the unmeasured "80% of LSP read-side" claim with a structural argument: shim wraps shipped engines (show / impact / watch) via stdio without re-extracting structure; an LSP *engine* would duplicate moat B substrate (the actual reason not to build one). Cited application/show-engine.ts + application/impact-engine.ts as the substrate the shim wraps. - § 6 Q1 — enriched with the AST-caching downstream measurement note lifted from § 2.4 (single canonical home for the daemon-default decision; § 2.4 cross-refs here). Vital-info preservation audit: - ✅ Closed-dead-subgraph caveat (8-file widget pack via fallow.md § 0) — kept verbatim in § 2.3 caveat block. - ✅ AST-caching capability claim — kept in § 2.4 "Capability unlocked" + cross-linked from § 6 Q1. - ✅ Watch-mode receipts (codemap watch / mcp --watch / serve --watch) — kept verbatim in § 2.4 "What's actually true". - ✅ Fan-in/fan-out hotspot-rankers framing — kept verbatim in § 2.3 caveat (with errata cross-ref to § 8). - ✅ Fallow `crates/lsp/` cross-ref — kept in § 2.5. Dropped (intentional): - "80% of LSP read-side" — unmeasured; replaced with structural argument that doesn't need a measurement. * docs(research): § 1.7 Mermaid — bounded-input contract (moat A) Grill-me Q4 outcome: § 1.7's "What's needed" cell was loose ("new --format mermaid formatter") — true but underspecified. Real-project edge counts on dependencies / calls are 1k-10k+; rendering them is either Mermaid-choking or a hairball, and silently auto-truncating (or "best-effort") would be a verdict-shaped affordance masquerading as an output mode — violates moat A. Locked in: - Allow on: impact engine output (depth-bounded), LIMIT N-shipped recipes (fan-in / fan-out), ad-hoc SQL with explicit LIMIT ≤ 50. - Reject (with scope-suggestion message) on unbounded inputs. 
- No auto-truncation — that's a verdict (recipe author's job to scope). Threshold (50 edges) is configurable; chosen as a default-readable upper bound for chat-client rendering. Calibrate during (a) impl PR against fixtures/golden / external corpus. DX framing: hairballed Mermaid in MCP / Cursor / Slack chat clients renders as garbage; a clear error naming knobs (LIMIT / --via / WHERE from_path LIKE) is the better consumer signal. This keeps Mermaid an output mode (moat A clean) and forces recipe authors to scope graphs — correct because they own the structural meaning of the result set. * docs(research): § 1.10 rename — recipe-shape (moat A) + parametrised recipes Grill-me Q5 outcome: § 1.10's verb-shape ("codemap rename <old> <new> --dry-run") was downstream of the OLD § 3 ("no fix engine" as a top- level non-goal). After the moat reframe, the actual test is moat A: verdict-shape vs recipe-shape. Verb hides every implicit rename choice (visibility filter, type-only re-exports, test files, aliases) inside argv parsing — not auditable. Recipe-shape puts those choices in reviewable SQL. Locked in: - Bundled recipe rename-preview.sql with --params key=value substitution (?-placeholder binding via db.ts prepared statements). - --format diff output mode (sibling of --format mermaid per item 1.7; same "rows in, renderable text out" pattern). - No new verb / engine / MCP tool / HTTP route. SQL stays the API. - Effort drops M → S. Cross-cutting infrastructure unlocked: parametrised recipes is net-new plumbing but pays for itself on the first downstream use. Already- visible follow-ons captured in the new "Cross-cutting infrastructure unlocked by item 1.10" paragraph at the end of § 1: - delete-symbol-preview, extract-function-preview, inline-symbol- preview — same recipe-shape pattern; all gated on the same plumbing. 
- Parametrising existing static recipes (untested-and-dead --params min_coverage=80 instead of hardcoded < 80) — cleanup opportunity the same plumbing enables.

This is the second moat-A demonstration in two adjacent grill rounds (after § 1.7's bounded-input contract on Mermaid). Both prove the "verdicts are output mode, recipes are the API" framing on real capabilities — exactly what the (a) plan-PR will need to point at when reviewers ask "what changed?".

* docs(research): § 6 — close Q1 (daemon-default), Q3 (LSP shape), Q4 (plugin scope)

Grill-me Q6 outcome (and accounting cleanup): three of five § 6 open questions are now resolved by prior grill outcomes — § 6 needs to reflect that, not pretend they're still open.

Resolutions captured:

- Q1 (daemon-default for mcp/serve) — RESOLVED THIS GRILL TURN. Default --watch ON for both modes; opt-out via --no-watch / CODEMAP_WATCH=0. One-shot CLI defaults preserved (no watcher on query/show/snippet). Receipts: stale-index = #1 agent UX complaint (fallow.md § 6); chokidar lazy startup validated tiny by PR #46 6-watcher audit. Flip is a small follow-up PR (flag default + test + patch changeset + agent rule update per docs/README.md Rule 10). AST-caching measurement parked downstream of the flip.
- Q3 (LSP shim vs standalone) — RESOLVED in § 2.5 reframe earlier this grill (commit 0b9d878). Thin shim wrapping shipped engines; no engine (would duplicate moat B substrate). Standalone deferred to "if VSCode-extension demand emerges."
- Q4 (C.9 plugin contract scope) — RESOLVED via § 5 (b) plan-PR pre-locked decisions (commit 6f845ba). Entry-point hints only for v1; arbitrary edge injection deferred to v2. Static config only per § 3 ergonomic "no JS exec at index time" floor.

§ 6 restructured: "Resolved (2026-05)" subsection at top with full rationale + receipts; "Still open" subsection below with Q2 (FTS5 default) and Q5 (history table) — the only two genuinely-open questions left.
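The Q1 resolution above (watch ON by default for mcp/serve, opt-out via --no-watch / CODEMAP_WATCH=0, no watcher on one-shot commands) can be sketched as a single resolver. This is an illustration only: `resolveWatch` and its input shape are hypothetical, and the precedence of the CLI flag over the env var is an assumption the commit message does not state.

```typescript
// Hypothetical resolver for the post-flip watch default (not codemap's real code).
type Command = "mcp" | "serve" | "query" | "show" | "snippet";

interface WatchInputs {
  command: Command;
  flag?: boolean; // true for --watch, false for --no-watch, undefined when absent
  env?: string; // value of CODEMAP_WATCH, if set
}

function resolveWatch({ command, flag, env }: WatchInputs): boolean {
  const longRunning = command === "mcp" || command === "serve";
  // One-shot CLI defaults preserved: query/show/snippet never start a watcher.
  if (!longRunning) return false;
  // Assumption: an explicit CLI flag wins over the environment variable.
  if (flag !== undefined) return flag;
  if (env === "0") return false;
  if (env === "1") return true;
  // Default after the flip: watch ON for mcp and serve.
  return true;
}

console.log(resolveWatch({ command: "mcp" })); // watcher on by default
console.log(resolveWatch({ command: "serve", env: "0" })); // opted out via env
```

Keeping the decision in one pure function would make the "flag default + test" part of the follow-up PR a table of inputs and expected booleans.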
§ 2.4 verdict updated to point at the resolved § 6 Q1 anchor instead of the open-question wording.

Anchor preservation: external links (#6-open-questions) still resolve to the section heading. New internal anchor (#resolved-2026-05) used by § 2.4 verdict — single inbound link, no external citations to break.

* docs(research): § 6 Q2 closed — FTS5 default-OFF, both config + CLI

Grill-me Q7 outcome: § 6 Q2 (FTS5 opt-in vs default-on) resolved.

Locked in:

- Toggle: BOTH codemap.config.ts `fts5: true` AND --with-fts CLI flag at index time. Config-only forces CI / ephemeral workflows to commit fts5: true to a config file; CLI-only forces long-term users to remember the flag on every --full. Cheap to support both.
- Default: OFF. Backwards-compat — existing users wouldn't see .codemap/index.db grow ~30-50% silently on next --full.
- Re-evaluate default in v2 once external-corpus size measurements land (bun run benchmark:query shape).

Bug fix in § 2.1: the "off by default to keep cold-start sub-100ms" framing was a WRONG REASON. FTS5 is index-time cost only; cold-start reads existing DB and the virtual table doesn't slow startup. Real reason for default-OFF is index size growth. § 2.1 verdict updated to reflect this; § 6 Q2 resolution explicitly calls out the wrong-reason correction so future readers see the diff.

Principle pinned: default-ON is reserved for capabilities without disk-size tax (Mermaid output, parametrised recipes, complexity column). FTS5 is the disk-tax exception.

Tree state after this commit:

- § 6 Q1 (daemon-default) — resolved
- § 6 Q2 (FTS5 default) — resolved
- § 6 Q3 (LSP shape) — resolved
- § 6 Q4 (plugin scope) — resolved
- § 6 Q5 (history table) — STILL OPEN (defer-bias confirmed by doc)

* docs(research): § 6 Q5 closed — history table deferred + full grill findings

Grill-me Q8 outcome: § 6 Q5 (history table) resolved as DEFERRED, with the full grill analysis preserved inline so the next reviewer doesn't have to re-derive why we said no.
Findings captured: - WHAT it would do — point-in-time index gains a temporal dimension ("when did symbol X get @deprecated?", "coverage trend over 50 commits", "files that became dead this week"). - WHAT audit --base <ref> already covers — pairwise diff serves the most-common temporal question (PR-scoped delta) with no schema growth. Longitudinal "evolved over commits 1..N" is the unfilled gap. - TWO SHAPES table — per-commit snapshots (~25 GB on 500-commit retention; trivial query cost) vs append-only event log (~5-25 MB deltas; heavy recursive-CTE query cost). - BACKFILL COST — N reindexes (~30s each = ~4 hrs first-run for 500 commits) is the same for both shapes; deal-breaker today. - ARCHITECTURE IMPACT — schema bump (minor per pre-v1 lesson), db.ts + indexer hooks, retention policy config, deeper git integration. - WHY DEFER — anti-bloat meta-rule (no recipe demands it); audit --base covers common case; backfill prohibitive without paying use case; shape-decision wasted without empirical access patterns. - REVISIT TRIGGERS — TWO consumers shipping jq-based "audit runs over time" workflows (mirrors B.5 verdict-threshold deferral pattern), OR query_baselines evolution becoming a recurring agent need. The full analysis is now inline in § 6 Q5 (~30 lines + cost table). Per user request: don't lose vital information; document grilling findings for fuller context. Future reviewers see the full reasoning, not just "deferred" — same posture as § 8 errata's "future readers can see the diff between v1 and v2." § 6 status after this commit: ALL FIVE OPEN QUESTIONS RESOLVED. Q1 (daemon-default), Q2 (FTS5 default), Q3 (LSP shape), Q4 (plugin scope), Q5 (history table) — every decision the doc was authored to force is now pinned with rationale and revisit triggers (where applicable). * docs(research): § 1.9 reframe + § 3 "No telemetry upload" floor Grill-me Q9 outcome: § 1.9's "Recipe usage telemetry" framing was a gotcha. 
The word "telemetry" carries upload / aggregation / surveillance connotations that don't match the actual capability (purely local recency tracking) — and would either get the feature rejected sight-unseen by privacy-conscious users / corp installations OR silently set up substrate for a future "phone home" PR without an explicit non-goal saying we won't. Renamed + tightened § 1.9: - "Recipe usage telemetry" → "Local recipe-recency tracking". - Table renamed recipe_usage → recipe_recency (named after the value, not the act). - Added 90-day retention bound (caps unbounded growth via per-reindex pruning). - Added opt-out config (`recipe_recency: false` skips the reconciler). - --recipes-json surface spec'd: {recipe_id, last_run_at, run_count_90d}. - Naming-note paragraph explains why "telemetry" was rejected. New § 3 ergonomic floor row "No telemetry upload": - Locks in the privacy posture explicitly. No HTTP-out primitive in codebase today (grep-able), but the floor exists to resist accumulation pressure — a future "anonymous opt-in usage stats to help prioritize recipes" PR would look reasonable without an explicit floor. - Convergent with fallow (probably also doesn't upload) — floor, not moat. - Cross-references item 1.9 as the only usage-data feature; consumers can audit the .codemap/index.db location + retention bound. Lockstep update needed when item 1.9 ships: docs/why-codemap.md "What Codemap is not" gains "Codemap never uploads usage data" per docs/README.md Rule 10. Already cross-referenced in § 7 of this doc. * docs(research): drop all fallow framing — codemap is structurally unique User reframe: codemap is the only SQL-based code index in the market; inspiration comes from the free and open internet (LSP spec, SQLite docs, AST tooling), not code-by-code cloning of any peer tool. Drop fallow as a yardstick throughout. 
Vital information preserved (per "don't lose any vital information that is used to execute the plan"): - Closed-dead-subgraph motivator for C.9 — kept as an abstract pattern description in § 2.3 caveat (N-file packs with self-imports, non- zero fan-in, none reachable from real entry). Was previously cited to fallow.md § 0; now stands on its own merit. - LSP read-side capabilities (show / impact / watch) — kept; LSP spec upstream is now the protocol authority instead of fallow's crates/lsp/. - Runtime-tracing scope distinction — § 3 floor reframed to anchor on "different product class entirely" (live process data vs static analysis) instead of "fallow's paid moat." - Predicate-as-API moat (A) — kept; justification now anchors on intrinsic merit (SQL is durable, agents compose any predicate) rather than "fallow ships verdicts; we don't." - Schema-breadth moat (B) — kept; justification now "codemap-specific extractions; their richness directly determines what JOINs are expressible" rather than "fallow has none of these." Section-by-section changes: - HEADER — "Companion docs / Source for deep-dives" replaced with "Companion doc" (competitive-scan only) + "Positioning" paragraph declaring structural uniqueness. - § 2.3 original-framing quote — paraphrased to drop the "(e.g. fallow, knip, jscpd)" parenthetical; pointers to roadmap.md for the full original wording. (roadmap.md itself still has the parenthetical; separate-PR scope.) - § 2.3 caveat — closed-dead-subgraph case described abstractly; no source citation needed. - § 2.5 LSP shim — "fallow has crates/lsp/" → "LSP spec upstream is the protocol authority." - § 3 intro — mission framing rewritten; "equal/surpass fallow" language replaced with "extract maximum value from the SQL-index architecture; grow the ecosystem" + "only SQL-based code index in the market" positioning. - § 3 Moat A — anchored on intrinsic merit (SQL durable + agent composability) instead of fallow comparison. 
- § 3 Moat B — anchored on "substrate every recipe layers on; richness determines JOIN expressivity" instead of "fallow has none of these." - § 3 ergonomic floors — dropped all "fallow is also fast" / "Convergent with fallow" annotations; reframed runtime-tracing as "different product class entirely (live process data, not static analysis)" + reframed telemetry-upload as standalone safety promise. - § 4 — DELETED ENTIRELY ("What to inspect in the fallow source tree"). Replaced with "Inspiration sources for plan-PR authoring" table listing open specs / primitive sources only (LSP spec, SQLite docs, oxc node reference, Lightning CSS, JSON-RPC + MCP spec, TC39 proposals, existing codemap surface, internal third-party graph audits). Discipline statement preserved: every plan PR cites the spec / primitive source it took inspiration from. - § 5 (d) row + T-table T+5w → +7w cell — dropped fallow crates/lsp/ refs; LSP spec is now the named authority. - § 6 Q1 — dropped fallow.md § 6 citation; stale-index frequency now anchored on PR #46 + PR #56 internal evidence. - § 6 Q4 — dropped fallow.md § 0 + § 6 citations; closed-dead-subgraph case cross-refs § 2.3 caveat instead. - § 7 cross-references — removed research/fallow.md and fallow upstream entries. Added § 4 inspection list as a self-reference. - § 8 errata § 2.3 row — dropped fallow.md citation; pattern described inline. Net effect: the doc stands on codemap's intrinsic structural properties. No peer-tool framing remains. The mission is now self-coherent: extract max value from the SQL-index architecture + grow the ecosystem, anchored on the unique-in-market positioning. * docs(research): retract uniqueness claim — honest cohort positioning Fact-check finding: the "structurally unique — only SQL-based code index in the market" claim doesn't hold. 
Web search + verification surfaced a real cohort of SQLite-backed code indexers for AI agents: - srclight (29 stars) — SQLite FTS5 + tree-sitter + embeddings + MCP, 42 tools, 11 langs. Pitch identical to codemap's ("AI agents spend 40-60% tokens on orientation; we eliminate this"). - Sverklo (30 stars) — local-first MCP, symbol graph, blast-radius, open-source alternative to Greptile/Sourcegraph. - ctxpp / ctx++ (17 stars) — Go MCP, tree-sitter, SQLite + FTS + vector, blast-radius analysis (= codemap's impact). - KotaDB (99 stars) — TS + Bun + SQLite — IDENTICAL stack to codemap. - codemogger (2026) — MCP, tree-sitter, SQLite + FTS + vector, semantic search. - @squirrelsoft/code-index, QuickAST, code-scale-mcp, CodeAgent Indexing Engine, Polyglot Indexer MCP, Continue's CodeSnippetsIndex — all SQLite-backed code indexers with overlapping surface. Codemap is one of ~10+, NOT unique. Retracting the claim. Honest differentiation (after verification): 1. Predicate-as-API — peers ship pre-baked verbs / MCP tools; codemap exposes raw SQL + recipes. Genuinely rare in the cohort. 2. Pure structural — no embeddings, no LLM in box. Most peers add vector search by default. Genuine differentiation. 3. JS/TS/CSS-ecosystem-deep extraction — CSS variables/classes/ keyframes, React components.hooks_used, type_members, markers. Peers focus on cross-language symbol+call surface via tree-sitter. The depth axis (3) is structurally enabled by parser choice — oxc (JS/TS) and lightningcss (CSS) are Rust-based and ecosystem- specialized; peers using tree-sitter trade depth for breadth. Where codemap is BEHIND the cohort (not hidden): multi-language support (codemap = TS/JS/CSS only; peers = 10-15 langs), star count, embeddings/semantic search, market traction. 
Edits applied:

- HEADER positioning paragraph: retracted "structurally unique"; named the cohort explicitly (srclight, Sverklo, ctxpp, KotaDB, codemogger, etc.); spelled out the three differentiation axes; added the parser-choice rationale (oxc + lightningcss as the structural enabler of axis 3).
- § 3 moat-intro line: replaced "the only SQL-based code index in the market" with "specific niche in the SQLite-backed-code-index cohort" + the three axes. Reviewer test reframed: eroding either moat turns codemap into "yet-another-tool-in-the-cohort instead of the predicate-shaped specialist."

Moats A and B themselves required no rewrite: their justifications (predicate-as-API durability + extracted-structure substrate) hold under the corrected positioning. The peer cohort discovery actually sharpens both moats: A is the specialty (raw SQL surface) and B is the depth axis (richer extraction than tree-sitter cohort).

* docs(research): § 1.4 refactor-risk formula - orphan + NULL fixes + caveat

Grill-me Q12 outcome: § 1.4's "fan_in × (100 - coverage_pct)" formula had two correctness bugs and one accepted modeling limitation:

CORRECTNESS FIXES (must ship):

- Orphans (fan_in=0) scored 0 → "no risk" → wrong (orphans are high-risk: dead code or hidden-import targets we don't track). Fix: `fan_in + 1` so orphans score on coverage alone.
- NULL coverage_pct propagated through the formula → 100 - NULL = NULL → row dropped from ORDER BY → unmeasured-coverage symbols silently vanished from the ranking. Fix: COALESCE(coverage_pct, 0) treats unmeasured as 0% (high risk).

ACCEPTED v1 TRADE-OFF:

- Linear-in-fan_in (fan_in 100 with 99% coverage = fan_in 1 with 0% coverage in the score). Real, but not worth fixing in the bundled recipe; users tune via project-local override.
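For readers following the arithmetic, here is a minimal TypeScript sketch of the corrected score. The bundled recipe itself is described as SQL; this only mirrors the two fixes (`fan_in + 1` and `COALESCE(coverage_pct, 0)`). The `SymbolRow` shape and the function names `refactorRisk` and `rankByRisk` are hypothetical and not taken from the codemap source.

```typescript
// Hypothetical row shape for illustration; codemap's real columns may differ.
interface SymbolRow {
  name: string;
  fanIn: number;              // number of inbound references (fan_in)
  coveragePct: number | null; // null when coverage was never measured
}

// Corrected score as described in the commit message:
//   (fan_in + 1) * (100 - COALESCE(coverage_pct, 0))
function refactorRisk(row: SymbolRow): number {
  // Mirrors COALESCE(coverage_pct, 0): unmeasured coverage counts as 0%.
  const coverage = row.coveragePct === null ? 0 : row.coveragePct;
  // Mirrors `fan_in + 1`: orphans (fanIn = 0) still score on coverage alone.
  return (row.fanIn + 1) * (100 - coverage);
}

// Highest risk first; returns a new array and leaves the input untouched.
function rankByRisk(rows: SymbolRow[]): SymbolRow[] {
  return [...rows].sort((a, b) => refactorRisk(b) - refactorRisk(a));
}
```

Under the original formula an orphan with unmeasured coverage would have scored 0 or dropped out of the ranking; in this sketch it scores 100, the same as any other completely uncovered symbol with no callers.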
Caveat block in refactor-risk-ranking.md (will accompany the recipe when (a) ships) names tuning axes for project-local overrides:

- Log-scale fan_in (LOG(fan_in + 1) * 30) for hub-heavy codebases
- Visibility weight (if @public / @internal / @beta JSDoc tags are used consistently)
- LOC weight (if test-density varies across files)

Why ship-with-caveat instead of multi-axis composite (Option B):

- Moat A says recipes are saved queries (starting points), not authoritative verdicts. Bundled formula gets 80% right; users iterate.
- Anti-bloat meta-rule: every additional axis encodes more opinions; shipping minimal forces explicit thought during tuning.
- Ecosystem-specific axes (visibility weight, LOC weight) shouldn't be in the bundled default.

Effort stays XS. The .md caveat block lands in the (a) plan PR / impl PR alongside the .sql; not part of THIS research-note PR's scope.

* docs(research): § 1.5 boundary violations - Shape A directional rules

Grill-me Q13 outcome: § 1.5 was underspecified ("--boundaries <config> flag on audit OR recipe consuming the config"). Three real questions needed answering: where the config lives, what shape, recipe-or-flag.

Shape A (directional rules) locked in for v1:

    boundaries: [
      {
        name: "no-cross-feature",
        from_glob: "src/features/*/**",
        to_glob: "src/features/*/**",
        action: "deny",
        except_self: true,
      },
      ...
    ]

Why A over B (element-types) over C (layers), the honest discriminator: A and B have IDENTICAL expressiveness (B compiles to A at index time). The real question is ergonomics-at-scale vs forward-compat / smallest-viable-config:

- A wins 5 of 6 dimensions: smallest-viable-config (one entry); Zod schema simplest; mental-model load (one concept); forward-compat (B layers on top later as sugar); backwards-compat (never paint into a corner; primitives are durable).
- B wins only "ergonomics at scale" (5+ rules with element reuse), exactly the dimension that can be added later as a sugar layer without breaking A.
- C (layer ordering) is most opinionated; only fits layered architectures. Not a v1 default.

Decision rule (ship the smallest primitive that doesn't paint into a corner; layer ergonomics on top later) mirrors § 6 Q5 history-table defer logic.

Implementation reuses every shipped or in-flight piece of plumbing:

- Zod config slot (existing src/config.ts substrate)
- Index-time reconciler (mirrors recipe_recency from item 1.9)
- New boundary_rules table (moat-B-aligned schema growth)
- Bundled recipe boundary-violations.sql via SQLite GLOB operator
- SARIF output formatter (already shipped) for CI gate

NO new CLI flag, so moat-A clean. The verb is query --recipe boundary-violations --format sarif. Recipe consumes config-as-data; SARIF output mode handles verdict-shaped CI consumers.

Effort stays S. Element-types / layer sugar deferred to v1.x with explicit "demand-driven" trigger (mirrors fallow.md B.5 verdict-threshold deferral pattern, kept in this doc as the recurring deferral idiom).

* docs(research): § 1.1, 1.6, 1.8 sanity sharpening (gotchas + envelopes)

Grill-me Q14 outcome: three remaining § 1 rows had implicit gotchas the recipe author would otherwise have to discover during impl. Each row gets a small clarification: substrate unchanged, effort unchanged.

§ 1.1 components-touching-deprecated:

- Was: "One bundled recipe (components-touching-deprecated)"
- Now: explicit two-path UNION
  - HOOK PATH: components.hooks_used JSON overlap with @deprecated symbols (catches deprecated hooks like useDeprecatedThing)
  - CALL PATH: calls.caller_name IN (SELECT name FROM components) × @deprecated symbols by callee_name (catches regular deprecated functions called inside components)
- Hook-only variants would ship false-negatives; recipe author needs the explicit UNION to avoid the trap.

§ 1.6 unused-type-members:

- Was: "Recipe (unused-type-members) - needs JSON-extraction predicate"
- Now: ADVISORY recipe with explicit caveat block in .md.
Output is "review these" candidates, NEVER "safe to delete". TS has multiple indirect-usage classes codemap's substrate doesn't track:

- Indexed access: T['fieldName']
- keyof T
- Type spreads: type X = T & {...}
- Mapped types: {[K in keyof T]: ...}

These produce false-positives. Recipe is useful as a candidate surfacer; agents must verify before deletion.

§ 1.8 more MCP resources:

- Was: hand-wave "add codemap://files/{path} and codemap://symbols/{name}"
- Now: spell out disambiguation envelope (reuses {matches, disambiguation?} pattern from PR #39 show/snippet): symbols with duplicate names across files (Component, index, default, util-name collisions) return all matches with by_kind / files / hint metadata. Plus ?in=<path-prefix> query parameter mirroring show --in <path>.
- Without spelling this out, the implementation would have to invent disambiguation OR ship a "first match wins" gotcha.

Net: each row's What's-needed cell now contains enough detail that the recipe / resource author can implement without re-deriving the JOIN structure or envelope shape. Tactical clarity layered on top of the structural decisions made in earlier grills.
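As an illustration of the § 1.8 disambiguation envelope, here is a minimal TypeScript sketch. Only the field names `matches`, `disambiguation`, `by_kind`, `files`, and `hint`, plus the `?in=<path-prefix>` filter, come from the commit message. The `SymbolMatch` row shape, the `buildSymbolEnvelope` name, the choice of a per-kind count for `by_kind`, and the hint wording are assumptions made for this example and may not match the PR #39 implementation.

```typescript
// Hypothetical row shape; the real symbols table may carry more columns.
interface SymbolMatch {
  name: string;
  kind: string; // e.g. "function", "component" (illustrative values)
  file: string;
}

interface Disambiguation {
  by_kind: Record<string, number>; // assumed here to be a count per kind
  files: string[];                 // distinct files containing a match
  hint: string;
}

interface SymbolEnvelope {
  matches: SymbolMatch[];
  disambiguation?: Disambiguation; // present only when the name is ambiguous
}

// Builds the envelope for one symbol name. `inPrefix` models ?in=<path-prefix>.
function buildSymbolEnvelope(rows: SymbolMatch[], inPrefix?: string): SymbolEnvelope {
  const matches =
    inPrefix === undefined ? rows : rows.filter((r) => r.file.startsWith(inPrefix));
  // Zero or one match needs no disambiguation metadata.
  if (matches.length <= 1) return { matches };

  const by_kind: Record<string, number> = {};
  for (const m of matches) by_kind[m.kind] = (by_kind[m.kind] ?? 0) + 1;
  const files = Array.from(new Set(matches.map((m) => m.file))).sort();

  return {
    matches, // every match is returned; no "first match wins"
    disambiguation: {
      by_kind,
      files,
      hint: `${matches.length} symbols share this name; narrow with ?in=<path-prefix>`,
    },
  };
}
```

Returning every match and attaching the metadata only when there is more than one keeps the unambiguous case small while still avoiding the "first match wins" gotcha the commit message warns about.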

coderabbitai Bot reviewed May 2, 2026


Comment thread docs/plans/watch-mode.md Outdated

docs(plan): add bash language to CLI fenced code block (CodeRabbit MD040)

0005ea8

Single CodeRabbit comment on PR #46: markdownlint MD040 caught a bare fence on the CLI surface example. Quick win.

SutuSebastian merged commit 1534d1c into main May 2, 2026
10 checks passed

SutuSebastian deleted the docs/plan-watch-mode branch May 2, 2026 18:40

SutuSebastian mentioned this pull request May 3, 2026

feat(watch): codemap watch — live reindex + mcp/serve --watch killer combo (impl of PR #46 plan) #47

Merged

5 tasks

github-actions Bot mentioned this pull request May 3, 2026

chore: version packages #31

Open

SutuSebastian mentioned this pull request May 3, 2026

docs(research): mark codemap watch (PR #47) shipped + bump fallow snapshot date #48

Merged

2 tasks

SutuSebastian mentioned this pull request May 3, 2026

docs(plan): codemap impact — symbol/file blast-radius walker #49

Merged

2 tasks

SutuSebastian mentioned this pull request May 4, 2026

docs(plan): static coverage ingestion (coverage-final.json → coverage table) #56

Merged

2 tasks

SutuSebastian merged 2 commits into main from docs/plan-watch-mode

changeset-bot Bot commented May 2, 2026 • edited

⚠️ No Changeset found


coderabbitai Bot commented May 2, 2026 • edited

Rate limit exceeded
