docs(plan): codemap watch — file-watcher library audit + decision (chokidar v5) #46
SutuSebastian merged 2 commits into main
…okidar v5)

6 watchers audited (chokidar, @parcel/watcher, nsfw, watchpack, turbowatch, node:fs.watch) on speed, robustness, OS coverage (macOS/Linux/Windows/WSL), JS runtime coverage (Bun + Node), maintenance, and install footprint.

Decision: chokidar v5. Wins on cross-runtime parity (pure JS — no N-API quirks on Bun), 14-year bug-hunt history (~30M repos using it), smallest install footprint of the JS abstractions (82 KB + 1 dep), atomic-write + chunked-write handling out of the box, and ESM-only Node ≥20.19 alignment with codemap's stack. @parcel/watcher kept as a defensible alternative if we ever measure chokidar perf as a bottleneck on >100k-file monorepos.

Plan also captures:

- agent-experience win matrix (today vs with watch mode — eliminates the 'is the index stale?' friction that every CLI/MCP/HTTP query rides on);
- sketched CLI surface (codemap watch standalone + codemap serve --watch / codemap mcp --watch killer combos + CODEMAP_WATCH=1 env shortcut);
- 5-tracer build plan with acceptance criteria;
- explicit out-of-scope (polling fallback, daemon detach, SSE push, multi-host).

Per docs-governance Rule 3: plan lives in docs/plans/watch-mode.md, deleted on ship.
@coderabbitai review |
✅ Actions performed: review triggered.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/plans/watch-mode.md`:
- Around line 94-100: The fenced code block showing the CLI examples (lines
containing "codemap watch", "codemap serve --watch", and "codemap mcp
--watch") is missing a language specifier; update the opening fence to include a
language (e.g., bash) so syntax highlighting and linters pick it up, ensuring
the rest of the block content remains unchanged.
…040) Single CodeRabbit comment on PR #46 — markdownlint MD040 caught a bare fence on the CLI surface example. Quick win.
…combo (impl of PR #46 plan) (#47)

* feat(watch): application/watcher.ts skeleton — pure debouncer + path filter + chokidar backend (Tracer 1 of 5)

Per docs/plans/watch-mode.md the engine is split into pure helpers + an injectable backend so tests don't need real fs watches:

- shouldIndexPath(relPath, excludeDirNames) — pure predicate. Same indexed extensions as the indexer (.ts/.tsx/.mts/.cts/.js/.jsx/.mjs/.cjs/.css) plus project-local recipes (<root>/.codemap/recipes/<id>.{sql,md}). Hand-rolled path-segment scan (no .split('/') alloc per call — watcher fires on every unrelated edit).
- createDebouncer(onFlush, delayMs) — sliding-window timer; trigger() resets, flushNow() forces, reset() clears without firing. Coalesces a burst (git checkout, npm install) into one onChange call.
- WatchBackend interface — production wires chokidar v5 (atomic + awaitWriteFinish for chunked-write detection); tests inject a fake backend that drives events deterministically (no flake in CI containers).
- runWatchLoop({root, excludeDirNames, onChange, debounceMs?, backend?}) — wires the three together. Returns {stop} that drains the debouncer + closes the backend (so SIGINT loses no events).
- DEFAULT_DEBOUNCE_MS = 250 (long enough to coalesce one editor save burst, short enough that agents don't perceive lag).

18 unit tests cover: extension whitelist, exclude-dir-names exact match (not substring — 'distill' is fine even when 'dist' is excluded), recipe paths, .codemap/codemap.db rejected, debouncer sliding window + flushNow + reset, backend dispatch, abs→rel POSIX conversion, dedup within burst, stop flushes pending.

chokidar v5 added (1 dep, 82 KB) per docs/plans/watch-mode.md decision. Tracer 2 wires cmd-watch.ts; Tracer 3 wires --watch into serve / mcp.

* feat(watch): cmd-watch.ts CLI verb + main/bootstrap wiring (Tracer 2 of 5)

Adds the standalone 'codemap watch' command.
Wires the engine from Tracer 1:

- parseWatchRest() with --debounce <ms> + --quiet flag (12 unit tests cover space/equals forms, validation, defaults, composition, error paths).
- printWatchCmdHelp() — explains the --debounce trade-off (lower = snappier, higher = fewer cycles during git checkout / npm install) and points at 'codemap serve --watch' / 'codemap mcp --watch' as the killer combo (Tracer 3).
- runWatchCmd() — bootstraps codemap (initCodemap + configureResolver), starts runWatchLoop with onChange = runCodemapIndex({mode: 'files', files: [...paths]}), awaits SIGINT/SIGTERM, drains pending edits before close.
- main.ts: dispatch on rest[0] === 'watch'; bootstrap.ts: validateIndexModeArgs accepts 'watch'; printCliUsage lists 'Watch mode' between HTTP server and Targeted reads.

Per-batch stderr line: 'codemap watch: reindex N file(s) in Mms' unless --quiet.

Smoke verified: 'bun src/index.ts watch' boots, logs the bind line, drains cleanly on SIGTERM. Tracer 3 wires --watch into serve / mcp.

* feat(watch): --watch flag on serve + mcp + shared createReindexOnChange helper (Tracer 3 of 5)

Killer combo: codemap mcp --watch / codemap serve --watch boots the transport AND a co-process file watcher in one process. Removes the 'is the index stale?' friction agents hit today (per docs/plans/watch-mode.md § Agent-experience win).

Factored helper to keep cmd-watch / mcp-server / http-server identical:

- application/watcher.ts: createReindexOnChange({quiet, label?}) — opens DB, runs targeted reindex on the changed paths, logs 'reindex N file(s) in Mms' to stderr unless quiet, catches errors so a transient parse failure doesn't kill the loop. Caller passes a label so 'codemap mcp' / 'codemap serve' / 'codemap watch' lines are distinguishable in interleaved logs.
- cmd-watch.ts now uses createReindexOnChange (DRY with the embedders).

CLI surface:

- cmd-mcp.ts: new --watch flag + --debounce <ms> override (default 250).
  Help text + parser tests + propagation through runMcpCmd → runMcpServer.
- cmd-serve.ts: same flags. Parser tests + 'serve: ... (watch: on)' marker on the bind line.
- CODEMAP_WATCH=1 / 'true' env shortcut for IDE / CI launches that can't easily edit the agent host's tool spawn command (per the plan's sketched API).

Embedder lifecycle:

- runMcpServer: starts watcher AFTER server.connect(transport); on shutdown awaits stopWatch() (drains pending reindex) before resolving.
- runHttpServer: starts watcher AFTER listen succeeds; on SIGINT/SIGTERM awaits stopWatch() then closes the listener.

146 tests pass (cmd-watch + cmd-mcp + cmd-serve + watcher + mcp-server + http-server). No new code paths in the existing engines — just the boot-time wiring.

* feat(watch): handleAudit skips incremental-index prelude when watcher is active (Tracer 4 of 5)

Closes the wasted-I/O loop the plan called out: today MCP audit's default behavior is to run an incremental-index prelude (so 'head' reflects the on-disk source) — but with mcp --watch / serve --watch the watcher already keeps the index fresh, so the prelude is pure overhead.

- application/watcher.ts: module-level watchActive flag toggled by runWatchLoop start/stop. isWatchActive() exposed for handleAudit; _resetWatchStateForTests + _markWatchActiveForTests for test seam.
- application/tool-handlers.ts handleAudit: shouldRunPrelude = !args.no_index && !(isWatchActive() && args.no_index !== false). Hoisted to function scope so the inner finally can also pass it as the readonly hint to closeDb (avoids a wasted checkpoint pass).
- Explicit no_index: false still forces the prelude even when watch is on (escape hatch for 'force re-index right now').
- 1 new watcher test + 1 new MCP-server integration test (audit succeeds with no_index unset when watcher is marked active — would have failed if the prelude tried to run on the test's freshly-created DB without git history).

62 watcher + MCP tests pass.
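The prelude gate from Tracer 4 is easier to read as a truth table. The sketch below lifts the quoted boolean expression into a pure function; `AuditArgs` and the `watchActive` parameter are illustrative stand-ins (the real handler reads `isWatchActive()` directly), so treat this as a reading aid rather than the shipped code.

```typescript
// Hypothetical, standalone restatement of the gate quoted in the commit message.
// `watchActive` stands in for isWatchActive() so the function stays pure.
interface AuditArgs {
  no_index?: boolean; // undefined means the caller did not pass the option
}

function shouldRunPrelude(args: AuditArgs, watchActive: boolean): boolean {
  // Same expression as in the commit message, with isWatchActive() injected.
  return !args.no_index && !(watchActive && args.no_index !== false);
}

// Print every combination of no_index (unset / false / true) and watch state.
const noIndexValues: Array<boolean | undefined> = [undefined, false, true];
for (const noIndex of noIndexValues) {
  for (const watchActive of [false, true]) {
    const run = shouldRunPrelude({ no_index: noIndex }, watchActive);
    console.log(`no_index=${String(noIndex)} watch=${watchActive} -> prelude=${run}`);
  }
}
```

The only combination where watch mode changes the outcome is `no_index` unset with the watcher active; an explicit `no_index: false` keeps the prelude on, which matches the escape hatch described above.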
* docs: sync README + architecture + glossary + roadmap + agents (Rule 10) + delete plan + changeset (Tracer 5 of 5)

- README.md 'Daily commands' stripe: extended with codemap mcp --watch / serve --watch / watch standalone / CODEMAP_WATCH=1 examples.
- docs/architecture.md: new 'Watch wiring' paragraph after MCP / HTTP wiring; covers chokidar selection, debounce + filter, audit prelude optimization. application/ table extended with watcher.ts.
- docs/glossary.md: new 'codemap watch / watch mode' entry under ## C (alphabetically before 'codemap mcp' / 'codemap serve' since 'watch' < 'serve' but the entry naming convention puts it after the existing CLI verbs).
- docs/roadmap.md: 'Watch mode for dev' line removed (shipped per Rule 2).
- .agents/rules/codemap.md + templates/agents/rules/codemap.md (Rule 10): new 'Watch mode (live reindex)' table row + --watch / --debounce flags appended to the mcp + serve rows.
- .agents/skills/codemap/SKILL.md + templates/agents/skills/codemap/SKILL.md: --watch / --debounce + CODEMAP_WATCH semantics on the MCP + HTTP server bullets; new 'Watch mode' bullet covering standalone vs combined shape choice and the audit prelude optimization.
- .changeset/codemap-watch.md: minor changeset (new top-level CLI verb + new --watch flag on mcp + serve).
- docs/plans/watch-mode.md: deleted on ship per docs-governance Rule 3.
- src/{application/{watcher,mcp-server,http-server}.ts, cli/cmd-{mcp,watch}.ts}: replaced dangling cross-refs to the deleted plan with cross-refs to architecture.md § Watch wiring.

* fix(watch): drain in-flight + prime gating + onError clears flag + 6 robustness fixes (CodeRabbit on #47)

9 of 10 CodeRabbit threads, all verified ✅ correct. The 10th (#4 — anchor) is ⚠️ partial: their suggested anchor (#watch-wiring) doesn't exist; #cli-usage is correct (precedent for every wiring paragraph). Pushing back with evidence in the reply.
**Major correctness fixes:**

- (#5, heavy) handleAudit treated 'watch active' as 'index definitely fresh' — but the watcher only sees NEW events, not historical drift. On boot before catch-up, audit could read a months-stale index. New onPrime opt on runWatchLoop runs an incremental catch-up BEFORE flipping watchActive=true. Embedders pass createPrimeIndex({label}) — same pattern as createReindexOnChange. Without onPrime, flag flips immediately (test-friendly default).
- (#7, heavy) stop() didn't drain async reindex work — fire-and-forget meant a stop() could resolve while onChange was mid-DB-write. Now: serialize onChange via inFlight chain, await it on stop. Also await primingDone so we don't tear down a DB connection out from under the prime catch-up.
- (#8) Backend onError left watchActive=true → handleAudit kept skipping prelude even when chokidar died. Now clears the flag.
- (#1) http-server: watcher boot throw after listen() leaked the listener. try/catch closes server on failure.
- (#2 + #10) http-server + cmd-watch: stopWatch().then(closeServer) never fired closeServer if stopWatch rejected — process hang on SIGTERM. Now .catch(log).finally(...) so progress is guaranteed.
- (#3) mcp-server.test: _markWatchActiveForTests ran outside the try guard — a thrown makeClient() would leak the singleton flag into sibling tests. Hoisted into try/finally.

**Minor:**

- (#6) shouldIndexPath built recipe prefix with platform sep, but relPath is POSIX-normalized. Windows recipe edits got skipped. Fixed to literal '.codemap/recipes/'.
- (#9) printWatchCmdHelp lacked JSDoc; added.

**Push-back:**

- (#4) CodeRabbit suggested changing #cli-usage anchor to #watch-wiring. Verified: there's no '## Watch wiring' heading — the wiring sections are bold-prefix paragraphs under '## CLI usage'. Their fix would 404. Keeping #cli-usage matches the precedent for MCP / HTTP / SARIF / serve wiring paragraphs.
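Two of the fixes above (#7 and #2 + #10) are promise-chaining patterns that a short example makes concrete. The sketch below is an assumption-laden illustration, not the code from the PR: `createSerialRunner`, `enqueue`, `drain`, and `shutdown` are hypothetical names chosen here.

```typescript
// Hypothetical names throughout; this only illustrates the two patterns.
type Task = () => Promise<void>;

function createSerialRunner(onError: (err: unknown) => void) {
  // Tail of the chain. Each new task is appended behind the previous one,
  // so two reindex batches never overlap.
  let inFlight: Promise<void> = Promise.resolve();

  return {
    enqueue(task: Task): void {
      // Catch per task so one failed batch does not block later ones.
      inFlight = inFlight.then(task).catch(onError);
    },
    // Resolves once everything enqueued so far has settled (fix #7: await on stop).
    drain(): Promise<void> {
      return inFlight;
    },
  };
}

// Fix #2 + #10: closeServer must run even if stopWatch rejects.
function shutdown(stopWatch: () => Promise<void>, closeServer: () => void): Promise<void> {
  return stopWatch()
    .catch((err: unknown) => console.error("stopWatch failed:", err))
    .finally(closeServer);
}
```

With a plain `stopWatch().then(closeServer)`, a rejection skips `closeServer` entirely; routing through `.catch(...).finally(...)` is what guarantees the listener closes on SIGTERM.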
5 new watcher tests cover the prime-gating race, onError flag clear, in-flight drain on stop (both auto-fire and flushNow paths). 152 tests pass total.

* docs(watch): outside-diff + 2 nitpicks from CodeRabbit follow-up review on #47

All 3 verified ✅ correct (the inline 10 actionable were already addressed in 207c05d):

- (outside-diff, mcp-server.ts:165) audit tool description still claimed prelude always runs first; updated to document watch-mode default ('default true-equivalent without watch, default false-equivalent with --watch active') and the 'pass no_index: false to force a re-index even when watch is active' escape hatch.
- (nitpick, glossary.md:395) widened wording from 'codemap mcp audit' to 'audit tool ... on both transports' since the same skip applies to 'codemap serve --watch' POST /tool/audit.
- (nitpick, cmd-serve.test.ts:14 + cmd-mcp.test.ts:11) replaced hardcoded debounceMs: 250 with imported DEFAULT_DEBOUNCE_MS so future default changes don't silently break tests.
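For readers who have not opened `application/watcher.ts`, the sliding-window debouncer described under Tracer 1 might look roughly like the sketch below. It follows the `createDebouncer(onFlush, delayMs)` shape and the `trigger` / `flushNow` / `reset` semantics stated in the commit message; everything else is an assumption, in particular that `flushNow()` is a no-op when nothing is pending.

```typescript
// Sketch only. The shipped createDebouncer may differ in signature and details.
interface Debouncer {
  trigger(): void;  // (re)start the window; a burst keeps pushing the flush out
  flushNow(): void; // fire immediately (assumed: only if a flush is pending)
  reset(): void;    // drop any pending flush without firing
}

function createDebouncer(onFlush: () => void, delayMs: number): Debouncer {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const clear = (): void => {
    if (timer !== undefined) {
      clearTimeout(timer);
      timer = undefined;
    }
  };

  return {
    trigger(): void {
      clear(); // sliding window: every event restarts the countdown
      timer = setTimeout(() => {
        timer = undefined;
        onFlush();
      }, delayMs);
    },
    flushNow(): void {
      if (timer === undefined) return; // nothing pending
      clear();
      onFlush();
    },
    reset: clear,
  };
}
```

A burst of `trigger()` calls inside the window collapses into one `onFlush`, which is how a `git checkout` touching many files becomes a single reindex batch.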
…pshot date (#48)

- fallow.md status snapshot date 2026-05-02 → 2026-05-03; new 'codemap watch' bullet under 'Adjacent shipped' covering the chokidar selection, three shapes, prime-gating + in-flight drain robustness fixes (CodeRabbit caught two heavy bugs on PR #47).
- competitive-scan-2026-04.md: 'Watch mode (codemap watch) — still backlog' line replaced with shipped row pointing at PR #47 (impl) + PR #46 (plan).
- Anchor cross-ref bumped to match the new fallow.md heading.
…(v1.x backlog) (#51)

* docs(plan): codemap audit --base <ref> — worktree + reindex strategy (v1.x backlog)

Plan for the next-best agent-value loop: PR-review structural-diff. Replaces today's 3-step --baseline dance with one verb. Reuses 90% of the existing audit infrastructure (PR #33); only new piece is the worktree+reindex snapshot path. Cache-by-resolved-sha; LRU 5/500 MiB; mutual-exclusive with --baseline; per-delta override compatible. Hard error on non-git projects (no graceful fallback — there's no meaningful 'ref' without git).

Plan only — implementation follows after CodeRabbit review per the impact (#49→#50) / watch (#46→#47) workflow.

* docs(plan): address CodeRabbit findings — atomic cache populate (D11), worktree-as-cache lifecycle clarity, TS type widening callout

- D1/D2/D8 rewritten: worktree IS the cache entry (kept until LRU evicts); cleanup runs only on reindex failure rollback OR LRU eviction. The earlier ambiguity (D2 said 'cache by sha' while D8 said 'remove in finally') is resolved.
- D11 added: atomic cache populate via per-pid temp dir + POSIX rename → free single-flight semantics; no lock files needed. Same pattern for eviction. Closes the race CodeRabbit flagged on concurrent CI matrix runs against the same sha.
- AuditBase TS type widening to discriminated union called out explicitly above the Decisions table (Tracer 1 ships it).
- CODEMAP_AUDIT_CACHE_SIZE env var mention dropped — was promising an unimplemented config knob; v1 hardcodes the limits, defer to v1.x+.
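D11's "per-pid temp dir + POSIX rename" pattern can be sketched with Node's `node:fs` API. This is a minimal illustration under stated assumptions: `populateCacheEntry`, the `.tmp-<pid>` suffix, and the directory layout are hypothetical and not taken from the plan, and the real implementation builds a git worktree rather than calling a `build` callback.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: build the entry in a private temp dir, then publish it
// with a single rename so readers never observe a half-populated entry.
function populateCacheEntry(
  cacheRoot: string,
  sha: string,
  build: (dir: string) => void,
): string {
  const finalDir = path.join(cacheRoot, sha);
  if (fs.existsSync(finalDir)) return finalDir; // already published

  fs.mkdirSync(cacheRoot, { recursive: true });
  const tmpDir = path.join(cacheRoot, `${sha}.tmp-${process.pid}`);
  fs.rmSync(tmpDir, { recursive: true, force: true }); // leftover from a crashed run
  fs.mkdirSync(tmpDir);

  try {
    build(tmpDir);
  } catch (err) {
    fs.rmSync(tmpDir, { recursive: true, force: true }); // rollback on build failure
    throw err;
  }

  try {
    fs.renameSync(tmpDir, finalDir); // publish step
  } catch (err) {
    fs.rmSync(tmpDir, { recursive: true, force: true });
    // If the entry exists now, another process published first; reuse it.
    if (!fs.existsSync(finalDir)) throw err;
  }
  return finalDir;
}
```

Because renaming onto an existing non-empty directory fails rather than merging, a process that loses the race discards its temp dir and reuses the published entry, which is why no lock file is needed. Note that concurrent processes may still each do the build work; only publication is single-winner in this sketch.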
…age` table) (#56)

* docs(plan): static coverage ingestion (Istanbul JSON → `coverage` table)

Plans the C.11 candidate from `research/fallow.md` — `codemap ingest-coverage <path>` reads Istanbul `coverage-final.json` into two new tables (`coverage` symbol-level + `file_coverage` rollup), joinable to `symbols` for the killer "what's structurally dead AND untested?" recipe in one query.

Resolves the open question from `fallow.md § 6` ("symbols column vs separate table?") in favour of a separate table with `ON DELETE CASCADE` (D1) — coverage shape evolves independently of structural columns; LEFT JOIN keeps NULL semantics explicit; rows survive `--full` reindex via the `query_baselines` precedent (D6).

Key decisions:

- Istanbul JSON in v1; LCOV in v1.x; raw V8 traces never (D3, fallow's paid moat).
- One-shot `ingest-coverage` verb decoupled from `codemap` index runs (D4) — coverage cadence (per `bun test --coverage`) ≠ index cadence (per file edit).
- Statement coverage only in v1 (D5); branch/function deferred until a consumer asks.
- MCP/HTTP exposure as a query column, not a separate `coverage` tool (D9) — composes with every existing recipe + ad-hoc SQL.
- `codemap audit --delta coverage` deferred to v1.x (D10) — raw schema first.

Five-tracer plan: schema bump → engine → CLI verb → fixture + golden recipe → docs.

Plan only — implementation follows after CodeRabbit review per the established workflow (PRs #46/47, #49/50, #51/52, #53/54).

* docs(plan): fact-check fixes — drop hallucinated SQL/projection/runner claims

Self-audit against the actual codebase surfaced four claims that didn't hold:

1. Killer recipe SQL referenced `callee_id` — `calls` is name-keyed (`callee_name TEXT`, no symbol-id FK; see `db.ts` `CallRow`). Rewrote the "no callers" predicate as `NOT EXISTS (… WHERE callee_name = s.name)`.
2. D7 claimed line-range projection is "the same `markers` already uses" — `markers` is line-pinned (`line_number INTEGER`), no projection.
   Reworded as "novel for this plan" with the actual mechanic spelled out.
3. D3 listed `bun test --coverage` as an Istanbul JSON emitter — `bun test --help` shows only `text` / `lcov` reporters today. Removed bun from the Istanbul-emitters list; left vitest/jest/c8/nyc with the explicit reporter flags they need.
4. D12 contradicted D6 ("rows absent until re-ingest" vs "rows survive `--full`"). Reconciled: empty is the correct initial state on first bump; subsequent bumps preserve via the `dropAll()` exclusion. Quoted the `lessons.md` policy verbatim instead of paraphrasing.

* docs(plan): v2 — fix CASCADE hazard + innermost-wins projection + nits

Self-grilling found two real schema design holes that would block execution:

1. **D6 CASCADE hazard.** Original draft keyed `coverage` on `symbol_id REFERENCES symbols(id) ON DELETE CASCADE`. Every `--full` reindex calls `dropAll()` → drops `symbols` → CASCADE wipes coverage, regardless of whether `coverage` itself was excluded from `dropAll()`. Recreated `symbols` get fresh auto-increment IDs anyway → coverage permanently lost without re-ingest. Fix: natural-key PK `(file_path, name, line_start)` — no FK to `symbols.id`. Survives the `symbols` drop-recreate cycle. Trade-off: orphan rows when files are deleted; cleaned by one explicit `DELETE FROM coverage WHERE file_path NOT IN (SELECT path FROM files)` after every ingest.
2. **D7 overlapping symbols.** Original draft: `line_start ≤ stmt_line ≤ line_end` matches every enclosing scope. With nested symbols (class methods inside classes, closures inside functions), one Istanbul statement projects onto 3+ symbols, inflating `total_statements` 2-3×. Fix: innermost-wins via `(line_end - line_start) ASC LIMIT 1`. New `skipped.statements_no_symbol` counter for statements that fall outside every symbol range (top-level expressions, side-effect imports).
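The innermost-wins rule from D7 is small enough to show in full. The sketch below restates the `(line_end - line_start) ASC LIMIT 1` selection as an in-memory function; `SymbolRange` and `innermostSymbol` are illustrative names, and the plan itself expresses this in SQL rather than TypeScript.

```typescript
// Illustrative types; the plan's actual schema uses line_start / line_end columns.
interface SymbolRange {
  name: string;
  lineStart: number;
  lineEnd: number;
}

// Returns the narrowest symbol whose range encloses stmtLine, or undefined when
// the statement sits outside every symbol (the statements_no_symbol case).
function innermostSymbol(symbols: SymbolRange[], stmtLine: number): SymbolRange | undefined {
  let best: SymbolRange | undefined;
  for (const s of symbols) {
    const encloses = s.lineStart <= stmtLine && stmtLine <= s.lineEnd;
    if (!encloses) continue;
    if (best === undefined || s.lineEnd - s.lineStart < best.lineEnd - best.lineStart) {
      best = s;
    }
  }
  return best;
}

// A class containing a method containing a closure.
const nested: SymbolRange[] = [
  { name: "Parser", lineStart: 1, lineEnd: 50 },
  { name: "Parser.parse", lineStart: 10, lineEnd: 20 },
  { name: "visit", lineStart: 12, lineEnd: 14 },
];

console.log(innermostSymbol(nested, 13)?.name); // visit
console.log(innermostSymbol(nested, 15)?.name); // Parser.parse
console.log(innermostSymbol(nested, 60)?.name); // undefined
```

Under the original "every enclosing scope" rule, line 13 would have been counted against all three symbols; with innermost-wins it is counted once.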
Nits cleared in the same pass:

- D2: drop `file_coverage` rollup table from v1 (aggregateable via GROUP BY on the symbol-level table; doubling sources of truth without a benchmark is premature). Promote to v1.x with a real query.
- D11: spec the `total_statements = 0 → coverage_pct IS NULL` edge case + document the cross-file name-collision lossiness in the killer recipe.
- Drop `--prune` flag (orphan cleanup is unconditional, no flag needed).
- Drop per-row `source` column (single meta key sufficient; one ingest at a time).
- Update killer recipe SQL to use the natural-key 3-column join.
- Drop made-up "~50 LoC LCOV ingester" estimate and "<50 ms / <1 ms / ~500 KB" performance numbers (no benchmark backed them).
- Tracer 1 / 2 / 3 acceptance criteria updated to match the new schema.

Plan is now ready for tracer-1 implementation. CodeRabbit pass deferred (rate-limited 57m).

* docs(plan): tighten Bun-native API references (file read + perf note)

Plan correctly inherits the established Node vs Bun runtime split, but the single tracer-3 reference understated it. Now:

- Tracer 3 cites `packaging.md § Node vs Bun` as the canonical pattern source instead of pointing at config.ts in passing.
- Performance section calls out the actual lever — `Bun.file(path).json()` uses Bun's native JSON parser, materially faster than V8 `JSON.parse` on multi-MB Istanbul payloads (real coverage files for medium codebases routinely hit several MB).

No new Bun-native API surfaces are added — the feature doesn't need globbing, file writes, spawn, or hashing beyond what the existing engines already use through their abstractions.

* docs(plan): v3 — ship LCOV in v1 + drop --source flag + bundle killer recipe

The "fully capable, no half-way APIs" principle reshapes three things:

1. **LCOV ingester ships in v1** alongside Istanbul. Original draft deferred LCOV to v1.x, which would exclude `bun test --coverage` users — i.e. codemap's own primary runtime.
   That's the textbook half-baked surface the principle bans. Two parser front-ends share one `upsertCoverageRows` core; LCOV is regex tokenizing over `SF:` / `DA:` / `end_of_record`. Tracer 2 splits into 2a (shared core + Istanbul parser) and 2b (LCOV parser), both writing identical normalised CoverageRow[] into the same upsert path.
2. **`--source istanbul|lcov` flag dropped.** Auto-detection from extension (`.json` → istanbul, `.info` → lcov, directory → probe both, error on ambiguous) is unambiguous; a flag for "tell codemap what it can already see" is API noise. Misnamed files can be renamed (one-liner) cheaper than codemap can grow a flag.
3. **Killer recipe ships as bundled `untested-and-dead.{sql,md}`** in `templates/recipes/`. Per the recipes-as-content registry (PR #37), the high-value queries become first-class agent surface. A buried doc snippet would be invisible to agents at session start; the bundled recipe shows up in `--recipes-json` and gets a `codemap query --recipe untested-and-dead` direct invocation.

Tracer 4 also fans out: Istanbul + LCOV fixtures cover the same partial coverage shape; three golden recipes (`coverage-istanbul.json`, `coverage-lcov.json`, `untested-and-dead.json`) prove format equivalence.

Out-of-scope, alternatives, performance section, title, and goal statement all updated to match.

* docs(plan): v4 — agent-journey audit + bundled recipe shelf (D13)

Walked every D / OOS / tracer item against "fully capable + agent first-class + no half-baked APIs". Found three half-baked surfaces:

1. **D2 deferral leaks "compose GROUP BY yourself" onto the agent.** Deferring the `file_coverage` table is correct (no benchmark proves it's needed) — but the agent-facing answer for "rank files by coverage" was missing. Fix: keep table deferral, ship a bundled `files-by-coverage.{sql,md}` recipe so the GROUP BY view IS first-class.
2. **D11 name-collision lossiness was acknowledged but unmitigated.** The killer recipe's `callee_name = s.name` cross-file lossiness was documented in the recipe SQL comment, but the recipe `.md` didn't give the agent any narrowing pattern. Now D11 ships three concrete narrowing patterns in the `.md` (file_path scope, default-export filter, exported-only restriction) so the agent has workable mitigations on day one.
3. **Missing recipe shelf for common agent questions.** Walking the journey: only "What's structurally dead AND untested?" had a recipe; "Rank files by coverage" and "Worst-covered exported symbols" forced ad-hoc SQL. Three recipes fully cover the agent journey end-to-end.

New D13 codifies the bundled-recipe principle: every common agent question gets a `--recipe` verb. Three v1 recipes:

- `untested-and-dead.{sql,md}` (killer, with name-collision mitigations)
- `files-by-coverage.{sql,md}` (replaces D2's table deferral)
- `worst-covered-exports.{sql,md}` (top-N agent ask)

Each `.md` carries a frontmatter `actions` block (per PR #26) so agents get per-row follow-up hints. All three appear in `--recipes-json` automatically — agents discover them at session start.

New "Agent journey" section makes the principle visible: a table mapping every common agent question to the v1 verb that answers it. If a row ever shows "compose SQL yourself" without a recipe, the surface is half-baked and needs a recipe before tracer 1 ships.

Tracer 4 expanded: ships all three recipes + five golden snapshots (adds files-by-coverage.json + worst-covered-exports.json on top of the three existing). Tracer 5 expanded: glossary + agent rule trigger table gain three new rows.

Plan now passes the principle audit end-to-end.
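The v3 notes describe the LCOV front-end as tokenizing over `SF:` / `DA:` / `end_of_record`, and the v2 nits pin down the `total_statements = 0 → coverage_pct IS NULL` edge case. A rough sketch of both follows; `parseLcov`, `LcovFile`, and `coveragePct` are hypothetical names, and the real ingester normalises into `CoverageRow[]` per symbol rather than per file.

```typescript
// Hypothetical shapes for illustration; not the plan's CoverageRow.
interface LcovFile {
  filePath: string;
  lines: Map<number, number>; // line number -> hit count
}

function parseLcov(text: string): LcovFile[] {
  const records: LcovFile[] = [];
  let current: LcovFile | undefined;

  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line.startsWith("SF:")) {
      // SF:<path> opens a record for one source file.
      current = { filePath: line.slice(3), lines: new Map() };
    } else if (line === "end_of_record") {
      if (current) records.push(current);
      current = undefined;
    } else if (current) {
      // DA:<line>,<hits>[,<checksum>]; other tags (TN, LF, LH, ...) are ignored.
      const m = /^DA:(\d+),(\d+)/.exec(line);
      if (m) current.lines.set(Number(m[1]), Number(m[2]));
    }
  }
  return records;
}

// Percentage of instrumented lines hit at least once; null when there are none,
// mirroring the coverage_pct IS NULL rule for total_statements = 0.
function coveragePct(file: LcovFile): number | null {
  const total = file.lines.size;
  if (total === 0) return null;
  let covered = 0;
  for (const hits of file.lines.values()) {
    if (hits > 0) covered += 1;
  }
  return (covered / total) * 100;
}
```

Returning `null` rather than `0` for an empty record keeps "no instrumented lines" distinguishable from "instrumented but never executed", which is the distinction the D11 nit is protecting.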
…plugin scope)

Grill-me Q6 outcome (and accounting cleanup): three of five § 6 open questions are now resolved by prior grill outcomes — § 6 needs to reflect that, not pretend they're still open.

Resolutions captured:

- Q1 (daemon-default for mcp/serve) — RESOLVED THIS GRILL TURN. Default --watch ON for both modes; opt-out via --no-watch / CODEMAP_WATCH=0. One-shot CLI defaults preserved (no watcher on query/show/snippet). Receipts: stale-index = #1 agent UX complaint (fallow.md § 6); chokidar lazy startup validated tiny by PR #46 6-watcher audit. Flip is a small follow-up PR (flag default + test + patch changeset + agent rule update per docs/README.md Rule 10). AST-caching measurement parked downstream of the flip.
- Q3 (LSP shim vs standalone) — RESOLVED in § 2.5 reframe earlier this grill (commit 0b9d878). Thin shim wrapping shipped engines; no engine (would duplicate moat B substrate). Standalone deferred to "if VSCode-extension demand emerges."
- Q4 (C.9 plugin contract scope) — RESOLVED via § 5 (b) plan-PR pre-locked decisions (commit 6f845ba). Entry-point hints only for v1; arbitrary edge injection deferred to v2. Static config only per § 3 ergonomic "no JS exec at index time" floor.

§ 6 restructured: "Resolved (2026-05)" subsection at top with full rationale + receipts; "Still open" subsection below with Q2 (FTS5 default) and Q5 (history table) — the only two genuinely-open questions left.

§ 2.4 verdict updated to point at the resolved § 6 Q1 anchor instead of the open-question wording.

Anchor preservation: external links (#6-open-questions) still resolve to the section heading. New internal anchor (#resolved-2026-05) used by § 2.4 verdict — single inbound link, no external citations to break.
User reframe: codemap is the only SQL-based code index in the market;
inspiration comes from the free and open internet (LSP spec, SQLite
docs, AST tooling), not code-by-code cloning of any peer tool. Drop
fallow as a yardstick throughout.
Vital information preserved (per "don't lose any vital information
that is used to execute the plan"):
- Closed-dead-subgraph motivator for C.9 — kept as an abstract pattern
description in § 2.3 caveat (N-file packs with self-imports, non-
zero fan-in, none reachable from real entry). Was previously cited
to fallow.md § 0; now stands on its own merit.
- LSP read-side capabilities (show / impact / watch) — kept; LSP spec
upstream is now the protocol authority instead of fallow's
crates/lsp/.
- Runtime-tracing scope distinction — § 3 floor reframed to anchor on
"different product class entirely" (live process data vs static
analysis) instead of "fallow's paid moat."
- Predicate-as-API moat (A) — kept; justification now anchors on
intrinsic merit (SQL is durable, agents compose any predicate)
rather than "fallow ships verdicts; we don't."
- Schema-breadth moat (B) — kept; justification now "codemap-specific
extractions; their richness directly determines what JOINs are
expressible" rather than "fallow has none of these."
Section-by-section changes:
- HEADER — "Companion docs / Source for deep-dives" replaced with
"Companion doc" (competitive-scan only) + "Positioning" paragraph
declaring structural uniqueness.
- § 2.3 original-framing quote — paraphrased to drop the "(e.g.
fallow, knip, jscpd)" parenthetical; pointers to roadmap.md for the
full original wording. (roadmap.md itself still has the parenthetical;
separate-PR scope.)
- § 2.3 caveat — closed-dead-subgraph case described abstractly; no
source citation needed.
- § 2.5 LSP shim — "fallow has crates/lsp/" → "LSP spec upstream is
the protocol authority."
- § 3 intro — mission framing rewritten; "equal/surpass fallow"
language replaced with "extract maximum value from the SQL-index
architecture; grow the ecosystem" + "only SQL-based code index in
the market" positioning.
- § 3 Moat A — anchored on intrinsic merit (SQL durable + agent
composability) instead of fallow comparison.
- § 3 Moat B — anchored on "substrate every recipe layers on; richness
determines JOIN expressivity" instead of "fallow has none of these."
- § 3 ergonomic floors — dropped all "fallow is also fast" /
"Convergent with fallow" annotations; reframed runtime-tracing as
"different product class entirely (live process data, not static
analysis)" + reframed telemetry-upload as standalone safety promise.
- § 4 — DELETED ENTIRELY ("What to inspect in the fallow source
tree"). Replaced with "Inspiration sources for plan-PR authoring"
table listing open specs / primitive sources only (LSP spec, SQLite
docs, oxc node reference, Lightning CSS, JSON-RPC + MCP spec, TC39
proposals, existing codemap surface, internal third-party graph
audits). Discipline statement preserved: every plan PR cites the
spec / primitive source it took inspiration from.
- § 5 (d) row + T-table T+5w → +7w cell — dropped fallow crates/lsp/
refs; LSP spec is now the named authority.
- § 6 Q1 — dropped fallow.md § 6 citation; stale-index frequency now
anchored on PR #46 + PR #56 internal evidence.
- § 6 Q4 — dropped fallow.md § 0 + § 6 citations; closed-dead-subgraph
case cross-refs § 2.3 caveat instead.
- § 7 cross-references — removed research/fallow.md and fallow
upstream entries. Added § 4 inspection list as a self-reference.
- § 8 errata § 2.3 row — dropped fallow.md citation; pattern described
inline.
Net effect: the doc stands on codemap's intrinsic structural
properties. No peer-tool framing remains. The mission is now
self-coherent: extract max value from the SQL-index architecture +
grow the ecosystem, anchored on the unique-in-market positioning.
…quence (2026-05) (#58)

* docs(research): non-goals reassessment + fallow clone deep-dive map (2026-05)

Companion to research/fallow.md (capability tracker — what to adopt FROM fallow). This new doc inventories what THIS codebase already unlocks that the current Non-goals (v1) list forbids, post-C.11.

User observation: many non-goals were defensive choices made when the project was 1/10th its current size, then carried forward unchallenged as the surface grew (15+ recipes, 12+ tables, 3 engines, watch mode, coverage, audit, impact). The reframe: stop asking "what should we not do?" and start asking "what does the SQL-index-with-three-transports actually unlock that no other tool does?"

Findings:

§1 — 10 first-class agent capabilities sitting in unwritten JOINs / formatters / verbs (components-touching-deprecated, unimported-exports, complexity per symbol, refactor-risk-ranking, boundary violations, unused type members, Mermaid output, MCP file/symbol resources, recipe usage telemetry, rename --dry-run preview).

§2 — Five non-goals worth challenging:

- "No FTS5 / use ripgrep" — SQLite ships FTS5; ripgrep loses JOIN composition (TODOs inside @deprecated functions in <50% covered files is one query, vs three tools today).
- "No visualisation" — conflates rendering pixels with shaping render-ready data; Mermaid / D2 are JSON-shaped formatters (sibling of SARIF).
- "No static analysis" — we already ship deprecated-symbols, untested-and-dead, barrel-files, fan-in/out; the line was rhetorical. Real boundary is "no opinionated rule engine, no fix mutation".
- "No persistent daemon" — we have one (mcp --watch, serve --watch, watch); non-goal preserves a constraint that no longer exists.
- "No LSP replacement" — show + impact + watch is 80% of LSP read-side; ship a thin shim consuming existing engines, don't write an LSP.

§3 — Real architectural limits worth keeping (sub-100ms cold-start CLI, no LLM in box, no fix engine, no runtime tracing, no JS exec at index time).
§4 — Map of /Users/sutusebastian/Developer/OSS/fallow clone deep-dive points: which crates / docs / configs to inspect before each shipped feature so we adopt patterns rather than reinvent. Cite-the-source-path discipline mirrors the existing research/fallow.md cite-the-PR habit. §5 — Recommended sequence: (a) FTS5 + Mermaid one-PR non-goal flip → (c) complexity column → (b) C.9 plugin layer (multi-tracer big surface) → (d) LSP shim. (a) is the cheapest non-goal flip; ships a confidence move before the bigger surfaces. §6 — 5 open questions (daemon-by-default for MCP/HTTP, FTS5 opt-in, LSP shim vs standalone, plugin contract scope, history table shape). Doc-governance compliance: - Goes in docs/research/ per Rule 3 (research-class doc). - Cross-references roadmap, why-codemap, fallow.md, competitive-scan per Rule 5. - Doesn't duplicate non-goals (Rule 1) — proposes amendments to be applied when § 2 items ship, in lockstep with why-codemap per the Single source of truth table. - No inventory counts in narrative (Rule 6) — uses qualitative "15+ recipes / 12+ tables" only. * docs(research): triangulate non-goals reassessment vs descriptive baseline User cross-checked my prescriptive doc (non-goals-reassessment-2026-05.md) against composer-2-fast's descriptive baseline (codemap-capability- surface-2026-05.md) plus the codebase as source of truth. Found three factual errors in mine; baseline doc held up clean. Corrections applied: 1. § 1.2 (Exports never imported): codebase has `exports.re_export_source` column — original doc missed it. Re-exports require a JOIN through that column to avoid false positives on barrel-only exports. Effort bumped XS → S. 2. § 1.3 (Cyclomatic complexity): claimed "AST walker already counts nodes during parse" — false. `rg 'complexity|node_count|nodeCount' src/` returns zero matches. Node-counting is NOT in place; needs an extension to the AST walker in src/parser.ts. Effort bumped S → M. 3. 
§ 2.3 ("no static analysis" non-goal): listed `fan-in` and `fan-out` as "static analysis we already ship" — too loose. Per `fan-in.sql` (`ORDER BY fan_in DESC LIMIT 15`) they're hotspot rankers, not orphan / dead-code detectors. They don't cover the closed-dead- subgraph case from research/fallow.md § 0 (8-file pack with non- zero fan-in via self-import). That gap motivates C.9 framework plugin layer, not the "no static analysis" flip. Caveat now spelled out in the doc. Header updated: this doc is the **prescriptive** lens; the **descriptive baseline** lives in codemap-capability-surface-2026-05.md (read first). Cross-references list and § 8 errata block document the diff between v1 and v2 so future reviewers can see what changed and why. Process lesson encoded in § 8: every prescriptive research note should triangulate against a descriptive baseline (own doc or peer model) before recommending a ship sequence. Caught all three errors before they propagated into a plan PR. * docs(research): scrub local user paths from non-goals doc + new lesson User caught absolute-path leaks in the research note pointing at the fallow clone on the maintainer's machine. Three references replaced with the public upstream URL (https://github.com/fallow-rs/fallow): - Header "Local clone for deep-dives" → "Source for deep-dives" - § 4 heading "What to inspect in the local fallow clone" → "...in the fallow source tree" - § 7 cross-references "Local fallow clone — /Users/..." → "fallow upstream" Also adds a new general-purpose lesson to .agents/lessons.md: Never commit absolute local user paths — no /Users/<name>/…, /home/<name>/…, ~/…, or file:/// URIs in any tracked doc, code, comment, or PR body. Pattern: cite https://github.com/<org>/<repo> for upstream sources; repo-relative paths for in-tree references. Sibling to the existing "PR bodies via temp file" lesson — same family (committed strings need to be portable + non-leaking), different surface. 
* docs(lessons): add 'never commit local user paths' lesson (PR #58 catch) * docs(research): delete codemap-capability-surface-2026-05.md (existence test) Per docs/README.md existence test, this doc fails 3 of 4 criteria: - ❌ Doesn't document durable policy unavailable elsewhere — every fact reproducible from db.ts / builtin.ts / audit-engine.ts / --recipes-json - ❌ Doesn't track open work — pure snapshot - ❌ No unique historical context git log + architecture.md can't reconstruct - ✅ Cited by another doc (only because non-goals-reassessment cited it) Plus Rule 1 violation (duplicates architecture.md § Schema) and Rule 6 violation (hardcodes "15 recipes" / "9 of 15 ship actions" inventory counts in narrative). The real value the doc delivered was the **triangulation discipline** — catching 3 errors in non-goals-reassessment v1. That discipline is the durable artifact, not the doc. Codified in two places: 1. non-goals-reassessment § 8 errata + process lesson (kept) 2. .agents/lessons.md — new lesson explicitly bans the "dual descriptive + prescriptive doc" pattern as a Rule 1 violation. Right discipline: pin every concrete claim in the prescriptive doc itself, or self-audit against the canonical home before committing. Don't ship a parallel descriptive doc. non-goals-reassessment header + § 7 + § 8 updated to drop the now-deleted companion-doc references and point at canonical sources directly (architecture.md § Schema, db.ts, builtin.ts, audit-engine.ts V1_DELTAS). * docs(research): align § 5 (c) effort with § 1.3 / § 8 (M, not S) CodeRabbit caught § 5 row (c) "Cyclomatic complexity column" listing effort S, while § 1.3 + § 8 errata both list M (the v1→v2 bump after `rg 'complexity|node_count|nodeCount' src/` returned zero — node- counting isn't already in place; the AST walker in src/parser.ts has to be extended). Effort propagation gap from the v2 errata pass. 
§ 5 row (c) updated to M; "Why" cell now spells out the AST-walker dependency inline so future readers don't re-litigate the figure. * docs(research): split § 3 into moat (load-bearing) vs ergonomic limits Grill-me Q1 outcome (under "extract max from SQL-index + equal/surpass fallow" mission): the original § 3 list conflated ergonomic floors (sub-100ms cold-start, no LLM, no JS at index time) with the actual moats. Most of the original entries are floors fallow also follows; they're not differentiators. The two real moats that needed naming as load-bearing limits: A. SQL is the API — every capability is a recipe (saved query) or a primitive recipes can compose. Verdicts are an OUTPUT mode (--format sarif, audit deltas), never a primitive. Reviewer test: "is this verb also expressible as query --recipe <id>?" B. Extracted structure ≥ verdicts — schema breadth (CSS, markers, type_members, calls.caller_scope, components.hooks_used) is what equals/surpasses fallow on agent-facing capability per fallow.md § 5. Reviewer test for any "drop column X" PR: "what recipe (bundled or hypothetical) does this kill?" Both are now load-bearing rows above the ergonomic ones. The original five preferences are kept verbatim but annotated with their relation to the moat (floor / convergent / adjacent / rivalrous / safety). Eroding either A or B is the most likely path from "codemap" to "fallow with extra steps" — § 3 now equips a reviewer to spot it. * docs(research): § 5 ship sequence — parallel plan-PR for (b) at T+0 Grill-me Q2 outcome (under "equal/surpass fallow" mission): the "cheapest non-goal flip first" ordering was a small-team confidence move, but the § 3 moat rewrite already paid that confidence cost. The real risk under the actual mission is the deferral trap — XL items become "next quarter" while every new recipe inherits the noisy substrate (untested-and-dead's Next.js page.tsx false-positive class). Hybrid resolved: - Shipping cadence stays (a) → (c) → (b) impl → (d). 
- (b) plan PR opens at T+0, iterates in parallel during (a)+(c). - Plan opens with ~30% of decisions pre-locked: entry-point hints only per Grill Q4, static config only per § 3 "no JS exec at index time" ergonomic limit. Not a blank-slate plan — structured from day 1. Added a 5-row T-table in § 5 spelling out the parallel tracks. (b)'s "Why" cell now names the deferral trap explicitly; (d)'s "Why" pins its dep on (b) impl (not just (b)). Rationale list updated to flag that the moat rewrite paid the confidence move so (a) doesn't pay it again. Cost-if-abandoned escape hatch: plan PR can close as "Status: Rejected (YYYY-MM-DD)" per docs/README.md Rule 8. Design surface captured either way. * docs(research): § 2 reframed via § 3 moats (taxonomy + verdict cross-refs) Grill-me Q3 outcome: § 2's five flips inherited their shape from "original non-goals worth challenging" — but after § 3 locked in the moats, that shape conflated three different categories: - Moat-extending flips (2.1 FTS5, 2.3 static analysis) — substrate growth inside moat B - Moat-aligned flip (2.2 output formatters) — verdicts as output mode per moat A - Moat-orthogonal transport flips (2.4 daemon, 2.5 LSP shim) — neither moat is touched; flipping just re-exposes existing substrate Anchors preserved (2.1-2.5 stay) — anchor-preservation discipline per docs-governance § 3 / docs/README.md Rule 7. No cascading link updates needed in § 3 / § 4 / § 5 / § 8. Changes per section: - § 2 header — added a reading note naming the three categories and pointing each flip at the moat row it relates to. - § 2.3 — verdict no longer restates "no opinionated rule engine + no fix engine" (now canonical in § 3 moat A + ergonomic row); instead cross-references and names the static-analysis category as in-scope. Closed-dead-subgraph caveat preserved (it's the C.9 motivator). - § 2.4 — added "Moat relation: orthogonal" subsection naming the transport / process-model framing. 
AST-caching capability claim preserved + cross-linked to § 6 Q1. Verdict points the daemon-default question at § 6 Q1 explicitly (single canonical home). - § 2.5 — replaced the unmeasured "80% of LSP read-side" claim with a structural argument: shim wraps shipped engines (show / impact / watch) via stdio without re-extracting structure; an LSP *engine* would duplicate moat B substrate (the actual reason not to build one). Cited application/show-engine.ts + application/impact-engine.ts as the substrate the shim wraps. - § 6 Q1 — enriched with the AST-caching downstream measurement note lifted from § 2.4 (single canonical home for the daemon-default decision; § 2.4 cross-refs here). Vital-info preservation audit: - ✅ Closed-dead-subgraph caveat (8-file widget pack via fallow.md § 0) — kept verbatim in § 2.3 caveat block. - ✅ AST-caching capability claim — kept in § 2.4 "Capability unlocked" + cross-linked from § 6 Q1. - ✅ Watch-mode receipts (codemap watch / mcp --watch / serve --watch) — kept verbatim in § 2.4 "What's actually true". - ✅ Fan-in/fan-out hotspot-rankers framing — kept verbatim in § 2.3 caveat (with errata cross-ref to § 8). - ✅ Fallow `crates/lsp/` cross-ref — kept in § 2.5. Dropped (intentional): - "80% of LSP read-side" — unmeasured; replaced with structural argument that doesn't need a measurement. * docs(research): § 1.7 Mermaid — bounded-input contract (moat A) Grill-me Q4 outcome: § 1.7's "What's needed" cell was loose ("new --format mermaid formatter") — true but underspecified. Real-project edge counts on dependencies / calls are 1k-10k+; rendering them is either Mermaid-choking or a hairball, and silently auto-truncating (or "best-effort") would be a verdict-shaped affordance masquerading as an output mode — violates moat A. Locked in: - Allow on: impact engine output (depth-bounded), LIMIT N-shipped recipes (fan-in / fan-out), ad-hoc SQL with explicit LIMIT ≤ 50. - Reject (with scope-suggestion message) on unbounded inputs. 
- No auto-truncation — that's a verdict (recipe author's job to scope). Threshold (50 edges) is configurable; chosen as a default-readable upper bound for chat-client rendering. Calibrate during (a) impl PR against fixtures/golden / external corpus. DX framing: hairballed Mermaid in MCP / Cursor / Slack chat clients renders as garbage; a clear error naming knobs (LIMIT / --via / WHERE from_path LIKE) is the better consumer signal. This keeps Mermaid an output mode (moat A clean) and forces recipe authors to scope graphs — correct because they own the structural meaning of the result set. * docs(research): § 1.10 rename — recipe-shape (moat A) + parametrised recipes Grill-me Q5 outcome: § 1.10's verb-shape ("codemap rename <old> <new> --dry-run") was downstream of the OLD § 3 ("no fix engine" as a top- level non-goal). After the moat reframe, the actual test is moat A: verdict-shape vs recipe-shape. Verb hides every implicit rename choice (visibility filter, type-only re-exports, test files, aliases) inside argv parsing — not auditable. Recipe-shape puts those choices in reviewable SQL. Locked in: - Bundled recipe rename-preview.sql with --params key=value substitution (?-placeholder binding via db.ts prepared statements). - --format diff output mode (sibling of --format mermaid per item 1.7; same "rows in, renderable text out" pattern). - No new verb / engine / MCP tool / HTTP route. SQL stays the API. - Effort drops M → S. Cross-cutting infrastructure unlocked: parametrised recipes is net-new plumbing but pays for itself on the first downstream use. Already- visible follow-ons captured in the new "Cross-cutting infrastructure unlocked by item 1.10" paragraph at the end of § 1: - delete-symbol-preview, extract-function-preview, inline-symbol- preview — same recipe-shape pattern; all gated on the same plumbing. 
- Parametrising existing static recipes (untested-and-dead --params min_coverage=80 instead of hardcoded < 80) — cleanup opportunity the same plumbing enables. This is the second moat-A demonstration in two adjacent grill rounds (after § 1.7's bounded-input contract on Mermaid). Both prove the "verdicts are output mode, recipes are the API" framing on real capabilities — exactly what the (a) plan-PR will need to point at when reviewers ask "what changed?". * docs(research): § 6 — close Q1 (daemon-default), Q3 (LSP shape), Q4 (plugin scope) Grill-me Q6 outcome (and accounting cleanup): three of five § 6 open questions are now resolved by prior grill outcomes — § 6 needs to reflect that, not pretend they're still open. Resolutions captured: - Q1 (daemon-default for mcp/serve) — RESOLVED THIS GRILL TURN. Default --watch ON for both modes; opt-out via --no-watch / CODEMAP_WATCH=0. One-shot CLI defaults preserved (no watcher on query/show/snippet). Receipts: stale-index = #1 agent UX complaint (fallow.md § 6); chokidar lazy startup validated tiny by PR #46 6-watcher audit. Flip is a small follow-up PR (flag default + test + patch changeset + agent rule update per docs/README.md Rule 10). AST-caching measurement parked downstream of the flip. - Q3 (LSP shim vs standalone) — RESOLVED in § 2.5 reframe earlier this grill (commit 0b9d878). Thin shim wrapping shipped engines; no engine (would duplicate moat B substrate). Standalone deferred to "if VSCode-extension demand emerges." - Q4 (C.9 plugin contract scope) — RESOLVED via § 5 (b) plan-PR pre-locked decisions (commit 6f845ba). Entry-point hints only for v1; arbitrary edge injection deferred to v2. Static config only per § 3 ergonomic "no JS exec at index time" floor. § 6 restructured: "Resolved (2026-05)" subsection at top with full rationale + receipts; "Still open" subsection below with Q2 (FTS5 default) and Q5 (history table) — the only two genuinely-open questions left. 
§ 2.4 verdict updated to point at the resolved § 6 Q1 anchor instead of the open-question wording. Anchor preservation: external links (#6-open-questions) still resolve to the section heading. New internal anchor (#resolved-2026-05) used by § 2.4 verdict — single inbound link, no external citations to break. * docs(research): § 6 Q2 closed — FTS5 default-OFF, both config + CLI Grill-me Q7 outcome: § 6 Q2 (FTS5 opt-in vs default-on) resolved. Locked in: - Toggle: BOTH codemap.config.ts `fts5: true` AND --with-fts CLI flag at index time. Config-only forces CI / ephemeral workflows to commit fts5: true to a config file; CLI-only forces long-term users to remember the flag on every --full. Cheap to support both. - Default: OFF. Backwards-compat — existing users wouldn't see .codemap/index.db grow ~30-50% silently on next --full. - Re-evaluate default in v2 once external-corpus size measurements land (bun run benchmark:query shape). Bug fix in § 2.1: the "off by default to keep cold-start sub-100ms" framing was a WRONG REASON. FTS5 is index-time cost only; cold-start reads existing DB and the virtual table doesn't slow startup. Real reason for default-OFF is index size growth. § 2.1 verdict updated to reflect this; § 6 Q2 resolution explicitly calls out the wrong-reason correction so future readers see the diff. Principle pinned: default-ON is reserved for capabilities without disk-size tax (Mermaid output, parametrised recipes, complexity column). FTS5 is the disk-tax exception. Tree state after this commit: - § 6 Q1 (daemon-default) — resolved - § 6 Q2 (FTS5 default) — resolved - § 6 Q3 (LSP shape) — resolved - § 6 Q4 (plugin scope) — resolved - § 6 Q5 (history table) — STILL OPEN (defer-bias confirmed by doc) * docs(research): § 6 Q5 closed — history table deferred + full grill findings Grill-me Q8 outcome: § 6 Q5 (history table) resolved as DEFERRED, with the full grill analysis preserved inline so the next reviewer doesn't have to re-derive why we said no. 
Findings captured: - WHAT it would do — point-in-time index gains a temporal dimension ("when did symbol X get @deprecated?", "coverage trend over 50 commits", "files that became dead this week"). - WHAT audit --base <ref> already covers — pairwise diff serves the most-common temporal question (PR-scoped delta) with no schema growth. Longitudinal "evolved over commits 1..N" is the unfilled gap. - TWO SHAPES table — per-commit snapshots (~25 GB on 500-commit retention; trivial query cost) vs append-only event log (~5-25 MB deltas; heavy recursive-CTE query cost). - BACKFILL COST — N reindexes (~30s each = ~4 hrs first-run for 500 commits) is the same for both shapes; deal-breaker today. - ARCHITECTURE IMPACT — schema bump (minor per pre-v1 lesson), db.ts + indexer hooks, retention policy config, deeper git integration. - WHY DEFER — anti-bloat meta-rule (no recipe demands it); audit --base covers common case; backfill prohibitive without paying use case; shape-decision wasted without empirical access patterns. - REVISIT TRIGGERS — TWO consumers shipping jq-based "audit runs over time" workflows (mirrors B.5 verdict-threshold deferral pattern), OR query_baselines evolution becoming a recurring agent need. The full analysis is now inline in § 6 Q5 (~30 lines + cost table). Per user request: don't lose vital information; document grilling findings for fuller context. Future reviewers see the full reasoning, not just "deferred" — same posture as § 8 errata's "future readers can see the diff between v1 and v2." § 6 status after this commit: ALL FIVE OPEN QUESTIONS RESOLVED. Q1 (daemon-default), Q2 (FTS5 default), Q3 (LSP shape), Q4 (plugin scope), Q5 (history table) — every decision the doc was authored to force is now pinned with rationale and revisit triggers (where applicable). * docs(research): § 1.9 reframe + § 3 "No telemetry upload" floor Grill-me Q9 outcome: § 1.9's "Recipe usage telemetry" framing was a gotcha. 
The word "telemetry" carries upload / aggregation / surveillance connotations that don't match the actual capability (purely local recency tracking) — and would either get the feature rejected sight-unseen by privacy-conscious users / corp installations OR silently set up substrate for a future "phone home" PR without an explicit non-goal saying we won't. Renamed + tightened § 1.9: - "Recipe usage telemetry" → "Local recipe-recency tracking". - Table renamed recipe_usage → recipe_recency (named after the value, not the act). - Added 90-day retention bound (caps unbounded growth via per-reindex pruning). - Added opt-out config (`recipe_recency: false` skips the reconciler). - --recipes-json surface spec'd: {recipe_id, last_run_at, run_count_90d}. - Naming-note paragraph explains why "telemetry" was rejected. New § 3 ergonomic floor row "No telemetry upload": - Locks in the privacy posture explicitly. No HTTP-out primitive in codebase today (grep-able), but the floor exists to resist accumulation pressure — a future "anonymous opt-in usage stats to help prioritize recipes" PR would look reasonable without an explicit floor. - Convergent with fallow (probably also doesn't upload) — floor, not moat. - Cross-references item 1.9 as the only usage-data feature; consumers can audit the .codemap/index.db location + retention bound. Lockstep update needed when item 1.9 ships: docs/why-codemap.md "What Codemap is not" gains "Codemap never uploads usage data" per docs/README.md Rule 10. Already cross-referenced in § 7 of this doc. * docs(research): drop all fallow framing — codemap is structurally unique User reframe: codemap is the only SQL-based code index in the market; inspiration comes from the free and open internet (LSP spec, SQLite docs, AST tooling), not code-by-code cloning of any peer tool. Drop fallow as a yardstick throughout. 
Vital information preserved (per "don't lose any vital information that is used to execute the plan"): - Closed-dead-subgraph motivator for C.9 — kept as an abstract pattern description in § 2.3 caveat (N-file packs with self-imports, non- zero fan-in, none reachable from real entry). Was previously cited to fallow.md § 0; now stands on its own merit. - LSP read-side capabilities (show / impact / watch) — kept; LSP spec upstream is now the protocol authority instead of fallow's crates/lsp/. - Runtime-tracing scope distinction — § 3 floor reframed to anchor on "different product class entirely" (live process data vs static analysis) instead of "fallow's paid moat." - Predicate-as-API moat (A) — kept; justification now anchors on intrinsic merit (SQL is durable, agents compose any predicate) rather than "fallow ships verdicts; we don't." - Schema-breadth moat (B) — kept; justification now "codemap-specific extractions; their richness directly determines what JOINs are expressible" rather than "fallow has none of these." Section-by-section changes: - HEADER — "Companion docs / Source for deep-dives" replaced with "Companion doc" (competitive-scan only) + "Positioning" paragraph declaring structural uniqueness. - § 2.3 original-framing quote — paraphrased to drop the "(e.g. fallow, knip, jscpd)" parenthetical; pointers to roadmap.md for the full original wording. (roadmap.md itself still has the parenthetical; separate-PR scope.) - § 2.3 caveat — closed-dead-subgraph case described abstractly; no source citation needed. - § 2.5 LSP shim — "fallow has crates/lsp/" → "LSP spec upstream is the protocol authority." - § 3 intro — mission framing rewritten; "equal/surpass fallow" language replaced with "extract maximum value from the SQL-index architecture; grow the ecosystem" + "only SQL-based code index in the market" positioning. - § 3 Moat A — anchored on intrinsic merit (SQL durable + agent composability) instead of fallow comparison. 
- § 3 Moat B — anchored on "substrate every recipe layers on; richness determines JOIN expressivity" instead of "fallow has none of these." - § 3 ergonomic floors — dropped all "fallow is also fast" / "Convergent with fallow" annotations; reframed runtime-tracing as "different product class entirely (live process data, not static analysis)" + reframed telemetry-upload as standalone safety promise. - § 4 — DELETED ENTIRELY ("What to inspect in the fallow source tree"). Replaced with "Inspiration sources for plan-PR authoring" table listing open specs / primitive sources only (LSP spec, SQLite docs, oxc node reference, Lightning CSS, JSON-RPC + MCP spec, TC39 proposals, existing codemap surface, internal third-party graph audits). Discipline statement preserved: every plan PR cites the spec / primitive source it took inspiration from. - § 5 (d) row + T-table T+5w → +7w cell — dropped fallow crates/lsp/ refs; LSP spec is now the named authority. - § 6 Q1 — dropped fallow.md § 6 citation; stale-index frequency now anchored on PR #46 + PR #56 internal evidence. - § 6 Q4 — dropped fallow.md § 0 + § 6 citations; closed-dead-subgraph case cross-refs § 2.3 caveat instead. - § 7 cross-references — removed research/fallow.md and fallow upstream entries. Added § 4 inspection list as a self-reference. - § 8 errata § 2.3 row — dropped fallow.md citation; pattern described inline. Net effect: the doc stands on codemap's intrinsic structural properties. No peer-tool framing remains. The mission is now self-coherent: extract max value from the SQL-index architecture + grow the ecosystem, anchored on the unique-in-market positioning. * docs(research): retract uniqueness claim — honest cohort positioning Fact-check finding: the "structurally unique — only SQL-based code index in the market" claim doesn't hold. 
Web search + verification surfaced a real cohort of SQLite-backed code indexers for AI agents: - srclight (29 stars) — SQLite FTS5 + tree-sitter + embeddings + MCP, 42 tools, 11 langs. Pitch identical to codemap's ("AI agents spend 40-60% tokens on orientation; we eliminate this"). - Sverklo (30 stars) — local-first MCP, symbol graph, blast-radius, open-source alternative to Greptile/Sourcegraph. - ctxpp / ctx++ (17 stars) — Go MCP, tree-sitter, SQLite + FTS + vector, blast-radius analysis (= codemap's impact). - KotaDB (99 stars) — TS + Bun + SQLite — IDENTICAL stack to codemap. - codemogger (2026) — MCP, tree-sitter, SQLite + FTS + vector, semantic search. - @squirrelsoft/code-index, QuickAST, code-scale-mcp, CodeAgent Indexing Engine, Polyglot Indexer MCP, Continue's CodeSnippetsIndex — all SQLite-backed code indexers with overlapping surface. Codemap is one of ~10+, NOT unique. Retracting the claim. Honest differentiation (after verification): 1. Predicate-as-API — peers ship pre-baked verbs / MCP tools; codemap exposes raw SQL + recipes. Genuinely rare in the cohort. 2. Pure structural — no embeddings, no LLM in box. Most peers add vector search by default. Genuine differentiation. 3. JS/TS/CSS-ecosystem-deep extraction — CSS variables/classes/ keyframes, React components.hooks_used, type_members, markers. Peers focus on cross-language symbol+call surface via tree-sitter. The depth axis (3) is structurally enabled by parser choice — oxc (JS/TS) and lightningcss (CSS) are Rust-based and ecosystem- specialized; peers using tree-sitter trade depth for breadth. Where codemap is BEHIND the cohort (not hidden): multi-language support (codemap = TS/JS/CSS only; peers = 10-15 langs), star count, embeddings/semantic search, market traction. 
Edits applied:
- HEADER positioning paragraph — retracted "structurally unique"; named the cohort explicitly (srclight, Sverklo, ctxpp, KotaDB, codemogger, etc.); spelled out the three differentiation axes; added the parser-choice rationale (oxc + lightningcss as the structural enabler of axis 3).
- § 3 moat-intro line — replaced "the only SQL-based code index in the market" with "specific niche in the SQLite-backed-code-index cohort" + the three axes. Reviewer test reframed: eroding either moat turns codemap into "yet-another-tool-in-the-cohort instead of the predicate-shaped specialist."

Moats A and B themselves required no rewrite — their justifications (predicate-as-API durability + extracted-structure substrate) hold under the corrected positioning. The peer cohort discovery actually sharpens both moats: A is the specialty (raw SQL surface) and B is the depth axis (richer extraction than tree-sitter cohort).

* docs(research): § 1.4 refactor-risk formula — orphan + NULL fixes + caveat

Grill-me Q12 outcome: § 1.4's "fan_in × (100 - coverage_pct)" formula had two correctness bugs and one accepted modeling limitation:

CORRECTNESS FIXES (must ship):
- Orphans (fan_in=0) scored 0 → "no risk" → wrong (orphans are high-risk: dead code or hidden-import targets we don't track). Fix: `fan_in + 1` so orphans score on coverage alone.
- NULL coverage_pct propagated through the formula → 100 - NULL = NULL → row dropped from ORDER BY → unmeasured-coverage symbols silently vanished from the ranking. Fix: COALESCE(coverage_pct, 0) treats unmeasured as 0% (high risk).

ACCEPTED v1 TRADE-OFF:
- Linear-in-fan_in (fan_in 100 with 99% coverage = fan_in 1 with 0% coverage in the score). Real, but not worth fixing in the bundled recipe — users tune via project-local override.
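A minimal TypeScript sketch of the two correctness fixes above, assuming they combine into a single score of the form `(fan_in + 1) * (100 - COALESCE(coverage_pct, 0))`. The commit message names each fix separately and does not spell out the combined expression, so that combination, the `SymbolRow` shape, and the function names are all illustrative assumptions rather than the bundled recipe.

```typescript
// Hypothetical row shape; field names are assumptions for illustration only.
interface SymbolRow {
  name: string;
  fanIn: number;
  coveragePct: number | null; // null = coverage was never measured
}

// Assumed combined score: (fan_in + 1) * (100 - COALESCE(coverage_pct, 0)).
function refactorRisk(row: SymbolRow): number {
  const coverage = row.coveragePct ?? 0; // unmeasured coverage is treated as 0%
  return (row.fanIn + 1) * (100 - coverage); // +1 keeps orphans from scoring 0
}

// Highest-risk symbols first; the input array is not mutated.
function rankByRisk(rows: SymbolRow[]): SymbolRow[] {
  return [...rows].sort((a, b) => refactorRisk(b) - refactorRisk(a));
}

const sample: SymbolRow[] = [
  { name: "orphanHelper", fanIn: 0, coveragePct: 20 },
  { name: "unmeasuredUtil", fanIn: 3, coveragePct: null },
  { name: "hubModule", fanIn: 100, coveragePct: 99 },
];

for (const row of rankByRisk(sample)) {
  console.log(row.name, refactorRisk(row));
}
```

Under these assumptions an orphan with 20% coverage still receives a non-zero score, and a symbol with no coverage data is ranked rather than silently dropped. The score stays linear in fan-in, which is the trade-off the commit accepts for v1.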
Caveat block in refactor-risk-ranking.md (will accompany the recipe when (a) ships) names tuning axes for project-local overrides:
- Log-scale fan_in (LOG(fan_in + 1) * 30) for hub-heavy codebases
- Visibility weight (if @public / @internal / @beta JSDoc tags are used consistently)
- LOC weight (if test-density varies across files)

Why ship-with-caveat instead of multi-axis composite (Option B):
- Moat A says recipes are saved queries (starting points), not authoritative verdicts. Bundled formula gets 80% right; users iterate.
- Anti-bloat meta-rule — every additional axis encodes more opinions; shipping minimal forces explicit thought during tuning.
- Ecosystem-specific axes (visibility weight, LOC weight) shouldn't be in the bundled default.

Effort stays XS. The .md caveat block lands in the (a) plan PR / impl PR alongside the .sql; not part of THIS research-note PR's scope.

* docs(research): § 1.5 boundary violations — Shape A directional rules

Grill-me Q13 outcome: § 1.5 was underspecified ("--boundaries <config> flag on audit OR recipe consuming the config"). Three real questions needed answering: where the config lives, what shape, recipe-or-flag.

Shape A (directional rules) locked in for v1:

    boundaries: [
      {
        name: "no-cross-feature",
        from_glob: "src/features/*/**",
        to_glob: "src/features/*/**",
        action: "deny",
        except_self: true,
      },
      ...
    ]

Why A over B (element-types) over C (layers) — honest discriminator: A and B have IDENTICAL expressiveness (B compiles to A at index time). The real question is ergonomics-at-scale vs forward-compat / smallest-viable-config:
- A wins 5 of 6 dimensions: smallest-viable-config (one entry); Zod schema simplest; mental-model load (one concept); forward-compat (B layers on top later as sugar); backwards-compat (never paint into a corner; primitives are durable).
- B wins only "ergonomics at scale" (5+ rules with element reuse) — exactly the dimension that can be added later as a sugar layer without breaking A.
- C (layer ordering) is most opinionated; only fits layered architectures. Not a v1 default.

Decision rule (ship the smallest primitive that doesn't paint into a corner; layer ergonomics on top later) mirrors § 6 Q5 history-table defer logic.

Implementation reuses every shipped or in-flight piece of plumbing:
- Zod config slot (existing src/config.ts substrate)
- Index-time reconciler (mirrors recipe_recency from item 1.9)
- New boundary_rules table (moat-B-aligned schema growth)
- Bundled recipe boundary-violations.sql via SQLite GLOB operator
- SARIF output formatter (already shipped) for CI gate

NO new CLI flag — moat-A clean. The verb is query --recipe boundary-violations --format sarif. Recipe consumes config-as-data; SARIF output mode handles verdict-shaped CI consumers.

Effort stays S. Element-types / layer sugar deferred to v1.x with explicit "demand-driven" trigger (mirrors fallow.md B.5 verdict-threshold deferral pattern, kept in this doc as the recurring deferral idiom).

* docs(research): § 1.1, 1.6, 1.8 sanity sharpening (gotchas + envelopes)

Grill-me Q14 outcome: three remaining § 1 rows had implicit gotchas the recipe author would otherwise have to discover during impl. Each row gets a small clarification — substrate unchanged, effort unchanged.

§ 1.1 components-touching-deprecated:
- Was: "One bundled recipe (components-touching-deprecated)"
- Now: explicit two-path UNION
  - HOOK PATH: components.hooks_used JSON overlap with @deprecated symbols (catches deprecated hooks like useDeprecatedThing)
  - CALL PATH: calls.caller_name IN (SELECT name FROM components) × @deprecated symbols by callee_name (catches regular deprecated functions called inside components)
- Hook-only variants would ship false-negatives — recipe author needs the explicit UNION to avoid the trap.

§ 1.6 unused-type-members:
- Was: "Recipe (unused-type-members) — needs JSON-extraction predicate"
- Now: ADVISORY recipe with explicit caveat block in .md. Output is "review these" candidates, NEVER "safe to delete" — TS has multiple indirect-usage classes codemap's substrate doesn't track:
  - Indexed access: T['fieldName']
  - keyof T
  - Type spreads: type X = T & {...}
  - Mapped types: {[K in keyof T]: ...}

  These produce false-positives. Recipe is useful as a candidate surfacer; agents must verify before deletion.

§ 1.8 more MCP resources:
- Was: hand-wave "add codemap://files/{path} and codemap://symbols/{name}"
- Now: spell out disambiguation envelope (reuses {matches, disambiguation?} pattern from PR #39 show/snippet) — symbols with duplicate names across files (Component, index, default, util-name collisions) return all matches with by_kind / files / hint metadata. Plus ?in=<path-prefix> query parameter mirroring show --in <path>.
- Without spelling this out, the implementation would have to invent disambiguation OR ship a "first match wins" gotcha.

Net: each row's What's-needed cell now contains enough detail that the recipe / resource author can implement without re-deriving the JOIN structure or envelope shape. Tactical clarity layered on top of the structural decisions made in earlier grills.
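The § 1.8 disambiguation envelope described above can be sketched in TypeScript as follows. Only the `{matches, disambiguation?}` outline, the `by_kind` / `files` / `hint` field names, and the `?in=<path-prefix>` filter come from the commit message. The field types, the `SymbolMatch` shape, the hint wording, and the `resolveSymbol` helper are hypothetical and stand in for whatever PR #39 actually defines.

```typescript
// Hypothetical match row; the real show/snippet payload may carry more fields.
interface SymbolMatch {
  name: string;
  kind: string;
  file: string;
}

// Field names come from the commit message; their types are assumptions.
interface Disambiguation {
  by_kind: Record<string, number>;
  files: string[];
  hint: string;
}

interface SymbolEnvelope {
  matches: SymbolMatch[];
  disambiguation?: Disambiguation;
}

// Returns every match (never "first match wins"); inPrefix mimics ?in=<path-prefix>.
function resolveSymbol(index: SymbolMatch[], name: string, inPrefix?: string): SymbolEnvelope {
  const matches = index.filter(
    (s) => s.name === name && (inPrefix === undefined || s.file.startsWith(inPrefix)),
  );
  if (matches.length <= 1) return { matches }; // unambiguous: no extra metadata

  const by_kind: Record<string, number> = {};
  for (const m of matches) by_kind[m.kind] = (by_kind[m.kind] ?? 0) + 1;
  const files = [...new Set(matches.map((m) => m.file))].sort();
  const hint = `${matches.length} symbols named "${name}"; narrow with ?in=<path-prefix>`;
  return { matches, disambiguation: { by_kind, files, hint } };
}

const index: SymbolMatch[] = [
  { name: "Button", kind: "component", file: "src/a/Button.tsx" },
  { name: "Button", kind: "function", file: "src/b/Button.tsx" },
];

console.log(JSON.stringify(resolveSymbol(index, "Button"), null, 2));
console.log(JSON.stringify(resolveSymbol(index, "Button", "src/a/"), null, 2));
```

Returning all matches with metadata leaves the narrowing decision to the caller, which is the behaviour the commit contrasts with a "first match wins" gotcha.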
Summary
Plan-only PR for the next backlog item: `codemap watch` (the highest-impact remaining feature for AI-agent UX; it eliminates the "is the index stale?" friction every CLI / MCP / HTTP query rides on today).

Per docs-governance Rule 3, plans live at `docs/plans/.md` and get deleted on ship. This PR adds the plan; CodeRabbit gets a review pass; impl follows in a separate PR once decisions are settled.
What's in the plan
Behavior change
None — docs only. Implementation lands in a separate PR after this plan is reviewed + merged.
Test plan
Why ship the plan separately
Per the existing pattern (PR #43 SARIF, PR #44 serve), plans get a review pass before code lands. Three reasons:
Summary by CodeRabbit