feat(mcp): codemap mcp — agent-transports v1 (Tracer 1 of 7)#35

Merged
SutuSebastian merged 12 commits into main from feat/agent-transports-mcp
May 1, 2026

@SutuSebastian SutuSebastian commented May 1, 2026

Summary

Implements docs/plans/agent-transports.md v1 (lifted into docs/architecture.md § MCP wiring and the agent rule + skill on ship per Rule 2). Exposes codemap's structural-query surface to agent hosts (Claude Code, Cursor, Codex, generic MCP clients) as JSON-RPC tools over stdio — eliminates the bash round-trip on every agent invocation.

Status: Ready for review. All 7 tracers shipped; all 5 plan § 12 grill questions settled (recorded inline in the plan before deletion, mirrored in the agent rule + skill).

Surface (one tool per CLI verb plus MCP-only query_batch, all snake_case)

| Tool | Purpose |
| --- | --- |
| query | One read-only SQL statement. Same envelope as codemap query --json. |
| query_batch | MCP-only: N statements in one round-trip. Items are string \| {sql, summary?, changed_since?, group_by?}; batch-wide defaults with per-statement overrides. Per-statement errors isolated. |
| query_recipe | Bundled recipe by id; recipe actions attach to each row. |
| audit | {baseline_prefix, baselines} → {head, deltas}. Auto-runs incremental index unless no_index. |
| save_baseline | Polymorphic {name, sql? \| recipe?} with runtime exclusivity check. |
| list_baselines / drop_baseline | Catalog ops. |
| context | Project-bootstrap envelope (saves 4-5 query calls at session start). |
| validate | On-disk SHA-256 vs indexed files.content_hash. |

Resources (lazy-cached on first read_resource): codemap://recipes, codemap://recipes/{id}, codemap://schema, codemap://skill.

Grill round (plan § 12) — all 5 settled

  1. context AND validate? Both — validate costs ~10 LOC and unlocks "codemap doctor" agents.
  2. Multi-statement SQL? query (one-statement, CLI parity) plus query_batch (MCP-only, polymorphic items, batch-wide-defaults + per-statement-overrides). Rejected ;-delimited batches in query (would need a SQL tokenizer + diverge output shape).
  3. Resource caching? Lazy memoise on first read_resource — constant for server-process lifetime.
  4. Naming? snake_case throughout (matches MCP spec, GitHub MCP, Cursor built-ins). CLI stays kebab — translation at MCP-arg layer.
  5. save_baseline shape? One polymorphic tool with runtime exclusivity check (mirrors single CLI --save-baseline=<name> verb). Rejected two-tools-split.
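
The kebab-to-snake translation from item 4 can be sketched as below. This is a minimal illustration, not codemap's actual code: `flagToKey` and `keyToFlag` are hypothetical helper names, and the flag spellings produced by `keyToFlag` are only assumed to match the CLI.

```typescript
// Hypothetical helpers; the real MCP-arg layer in codemap may look different.

/** Convert a CLI flag such as "--changed-since" to a tool key "changed_since". */
export function flagToKey(flag: string): string {
  return flag.replace(/^--/, "").replace(/-/g, "_");
}

/** Convert a tool key such as "changed_since" back to a CLI-style flag. */
export function keyToFlag(key: string): string {
  return "--" + key.replace(/_/g, "-");
}

/** Translate a whole snake_case argument object into kebab-case flag names. */
export function keysToFlags(args: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(args)) {
    out[keyToFlag(key)] = value; // values pass through untouched
  }
  return out;
}
```

Keeping the translation in one place means tool schemas stay snake_case while the CLI parser never has to learn a second spelling.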

Tracer-bullet sequence (per plan § 11)

  • 1 — scaffold: cmd-mcp.ts + mcp-server.ts skeletons; ping stub; @modelcontextprotocol/sdk dep
  • 2 — query + query_batch — pure executeQuery engine extracted to src/application/query-engine.ts
  • 3 — query_recipe — recipe SQL + actions threading
  • 4 — audit + context + validate — reuses runAudit, buildContextEnvelope, computeValidateRows
  • 5 — save_baseline + list_baselines + drop_baseline — polymorphic save with runtime exclusivity
  • 6 — resources — 4 resources, lazy-cached
  • 7 — docs — architecture / glossary / README / rule + skill across .agents/ and templates/agents/ (Rule 10); plan deleted (Rule 2); minor changeset

Architecture

Mirrors the cmd-audit.ts ↔ audit-engine.ts seam from PR #33:

  • src/cli/cmd-mcp.ts — CLI shell (argv, --help, dispatch from main.ts)
  • src/application/mcp-server.ts — engine (tool registry, resource handlers, SDK wiring)
  • src/application/query-engine.ts — pure transport-agnostic engine extracted from printQueryResult's JSON branch (used by both MCP and codemap query --json going forward)

Bootstrap once at server boot; tool handlers reuse existing engine entry-points (executeQuery, runAudit, buildContextEnvelope, …) — no duplicate business logic.

Test plan

  • bun run check green on every push (29 new MCP tests via @modelcontextprotocol/sdk's InMemoryTransport — full integration coverage of every tool + resource + the polymorphic / error / inheritance edges)
  • bun src/index.ts mcp --help prints usage
  • Manual smoke test via npx @modelcontextprotocol/inspector bun src/index.ts mcp (post-merge — won't gate review)
  • CI green

Self-audit before pushing for review

  • .agents/ rules respected (tracer-bullets, verify-after-each-step, concise-comments swept on Tracer 7, docs-governance Rules 2 / 9 / 10)
  • ✅ Performance: bootstrap-once, lazy resources, per-batch changed_since memoisation, isolated per-statement errors
  • ✅ Architecture: clean engine seam, per-tool registration, isEnginePayloadError helper to dedup type-guards
  • ⚠️ Known follow-ups (deferred — separate PRs):
    • Lift 4 cli/* imports (resolveAuditBaselines, buildContextEnvelope, computeValidateRows, query-recipes) into application/ — currently documented layer-reversals
    • Server-lifetime changed_since cache (currently per-tool-call) — needs staleness-invalidation story before it's safe
    • Pool DB connection (currently openDb/closeDb per tool call) — measure first; bun:sqlite open is microseconds
    • Split mcp-server.ts when it crosses ~1000 lines (currently 770)

Composition with the recently shipped surface

Reuses every CLI primitive from PRs #26 / #28 / #30 / #33. HTTP API (codemap serve) stays in roadmap backlog — design points (tool taxonomy + output shape) reserved in architecture.md § MCP wiring so HTTP inherits them when its turn comes.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added codemap mcp command to expose Codemap functionality as JSON-RPC tools over stdio
    • Introduced batch query capability executing multiple statements in a single request with per-statement error isolation
    • Enabled MCP resource access for recipes, schema, and skill documentation with automatic caching
  • Dependencies

    • Added @modelcontextprotocol/sdk

First slice of docs/plans/agent-transports.md. Lands the SDK wiring
end-to-end: argv parser, help text, dispatch from main.ts, the
McpServer factory in src/application/mcp-server.ts, and a `ping`
stub tool that confirms stdio + JSON-RPC framing without depending
on any codemap engine.

Subsequent tracers replace the stub with real tools:

- 2: query (wraps runQueryCmd, tested via SDK in-process transport)
- 3: query_recipe (separate tool surface for recipe catalog)
- 4: audit (composes baselines via resolveAuditBaselines)
- 5: save_baseline / list_baselines / drop_baseline
- 6: codemap://recipes / recipes/{id} / schema / skill resources
- 7: docs (architecture.md MCP wiring, glossary, README, agent
     rule + skill across .agents/ and templates/agents/ per Rule 10)

Layering follows the cmd-audit.ts <-> audit-engine.ts seam from
PR #33 — cmd-mcp.ts owns argv + lifecycle, mcp-server.ts owns the
tool registry and SDK calls.

Open questions in plan § 12 still pending grill round before tracer 2;
nothing in this scaffold pre-commits any of them.
changeset-bot Bot commented May 1, 2026

🦋 Changeset detected

Latest commit: 02b316d

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

| Name | Type |
| --- | --- |
| @stainless-code/codemap | Minor |


coderabbitai Bot commented May 1, 2026

📥 Commits

Reviewing files that changed from the base of the PR and between 3fe2eb8 and 02b316d.

📒 Files selected for processing (12)
  • .agents/rules/codemap.md
  • .agents/skills/codemap/SKILL.md
  • README.md
  • docs/architecture.md
  • docs/glossary.md
  • src/application/mcp-server.test.ts
  • src/application/mcp-server.ts
  • src/application/query-engine.test.ts
  • src/application/query-engine.ts
  • src/cli/cmd-mcp.ts
  • templates/agents/rules/codemap.md
  • templates/agents/skills/codemap/SKILL.md
Walkthrough

This PR implements a complete MCP (Model Context Protocol) server for codemap, exposing structural query, audit, and baseline management functionality as JSON-RPC tools over stdio. It includes a new query engine, MCP server wiring, CLI integration, comprehensive tests, and extensive documentation updates spanning architecture, glossary, and agent integration templates.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Core MCP Server Implementation | src/application/mcp-server.ts, src/application/mcp-server.test.ts | Introduces createMcpServer and runMcpServer functions that expose 9 MCP tools (query, query_batch, query_recipe, audit, context, validate, save_baseline, list_baselines, drop_baseline) with Zod validation, memoized git diff resolution, lazy-cached resources (codemap://recipes*, codemap://schema, codemap://skill), and consistent CLI-shaped response envelopes. Comprehensive test suite validates tool behavior, error isolation, baseline lifecycle, and resource discovery. |
| Query Execution Engine | src/application/query-engine.ts, src/application/query-engine.test.ts | Adds pure executeQuery and executeQueryBatch functions that run SQL statements, optionally filter by changed files, apply recipe actions, group results, and return --json envelope shapes with per-statement error isolation. Tests verify filtering, grouping, summary mode, and batch execution. |
| CLI Command Wiring | src/cli/cmd-mcp.ts, src/cli/cmd-mcp.test.ts, src/cli/bootstrap.ts, src/cli/main.ts | Introduces parseMcpRest, printMcpCmdHelp, and runMcpCmd to handle codemap mcp command dispatch. Updates bootstrap help text and CLI main dispatcher to recognize and route the mcp command. Tests verify parsing, help display, and error handling. |
| Core Documentation | docs/architecture.md, docs/glossary.md, docs/roadmap.md | Documents MCP server architecture (bootstrap flow, tool/resource mapping, git memoization), adds glossary entries for MCP and query_batch semantics, and refocuses roadmap to separate MCP (completed v1) from deferred HTTP codemap serve API. |
| Plan & Changeset Migration | .changeset/agent-transports-mcp-scaffold.md, docs/plans/agent-transports.md | Records new @stainless-code/codemap minor release with MCP feature and @modelcontextprotocol/sdk@^1.29.0 dependency. Removes detailed agent-transports plan (now superseded by architecture/glossary docs). |
| Agent Integration Documentation | .agents/rules/codemap.md, .agents/skills/codemap/SKILL.md, templates/agents/rules/codemap.md, templates/agents/skills/codemap/SKILL.md, README.md | Adds agent-facing documentation defining MCP tool surface (tool arguments, output shapes, error handling), resource URIs, and launch instructions. Mirrors content across both .agents/ active rules and templates/ source templates. |
| Dependencies | package.json | Adds @modelcontextprotocol/sdk@^1.29.0 runtime dependency. |

Sequence Diagram

sequenceDiagram
    actor Agent as Agent Host
    participant CLI as CLI Bootstrap
    participant MCP as MCP Server
    participant Engine as Query/Audit Engine
    participant DB as Database

    Agent->>CLI: codemap mcp (with --root, --config)
    CLI->>CLI: Parse & validate arguments
    CLI->>MCP: runMcpServer(opts)
    MCP->>MCP: bootstrapForMcp() - init codemap once
    MCP->>MCP: Create McpServer with StdioServerTransport
    MCP-->>Agent: Ready for JSON-RPC requests over stdin/stdout

    Agent->>MCP: {"jsonrpc":"2.0", "method":"tools/call", "params":{"name":"query", "arguments":{...}}}
    MCP->>MCP: Validate inputs with Zod
    MCP->>MCP: Memoize git diff for changed_since (if needed)
    MCP->>Engine: executeQuery(opts)
    Engine->>DB: Execute SQL statement
    DB-->>Engine: Row results
    Engine->>Engine: Apply grouping/summary/recipe actions
    Engine-->>MCP: QueryResultPayload (or {error})
    MCP->>MCP: Wrap in CLI --json envelope
    MCP-->>Agent: {"jsonrpc":"2.0", "result":{"content":[...]}}

    Agent->>MCP: {"jsonrpc":"2.0", "method":"tools/call", "params":{"name":"query_batch", "arguments":{"statements":[...]}}}
    MCP->>Engine: executeQueryBatch(statements)
    Engine->>Engine: Map each statement through executeQuery
    Engine-->>MCP: Array<QueryResultPayload | ExecuteQueryError>
    MCP-->>Agent: Per-statement results (errors isolated)

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested Labels

enhancement, documentation

Poem

🐇 Protocol hops through stdio streams,
Tools and resources paint agent dreams,
Query batches bundled, baselines preserved—
JSON envelopes perfectly curved!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 43.75% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: implementing an MCP server for codemap as part of agent-transports v1. It clearly specifies the feature ('codemap mcp'), the scope ('agent-transports v1'), and identifies this as Tracer 1 of a multi-step rollout. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Q1 — context AND validate as MCP tools (both ship in tracer 4
alongside audit). validate is a thin wrapper that costs ~10 LOC
and unlocks "codemap doctor" agents; defer-by-default would only
save the wrapping cost.

Q2 — query (one-statement, CLI parity) PLUS query_batch (MCP-only).
query_batch uses batch-wide-defaults + per-statement-overrides:

  statements: (string | {sql, summary?, changed_since?, group_by?})[]

String items inherit batch-wide flags; object items override on a
per-key basis. Output is an array of N elements, each shaped
exactly like single-query's output for the effective flag set.
SQL-only (no recipe polymorphism — query_recipe_batch is an
additive future change if asked).
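
The inheritance rule described above can be sketched as follows. The name `mergeBatchItem` appears elsewhere in this PR, but the body here is an assumption about how such a merge could work, and `group_by` is typed as a plain string only for illustration.

```typescript
// Sketch only: the types and merge logic are assumptions, not the shipped code.
type BatchFlags = { summary?: boolean; changed_since?: string; group_by?: string };
type BatchItem = string | ({ sql: string } & BatchFlags);
type ResolvedStatement = { sql: string } & BatchFlags;

/** Apply batch-wide defaults, then let object items override key by key. */
export function mergeBatchItem(item: BatchItem, defaults: BatchFlags): ResolvedStatement {
  // String items carry no flags of their own, so they inherit every default.
  if (typeof item === "string") return { ...defaults, sql: item };

  const merged: ResolvedStatement = { ...defaults, sql: item.sql };
  // Only keys the item actually sets replace the batch-wide value.
  if (item.summary !== undefined) merged.summary = item.summary;
  if (item.changed_since !== undefined) merged.changed_since = item.changed_since;
  if (item.group_by !== undefined) merged.group_by = item.group_by;
  return merged;
}
```

With defaults `{summary: true, changed_since: "main"}`, an item `{sql, summary: false}` ends up with `summary: false` but still inherits `changed_since: "main"`.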

Rejected: making query accept ;-delimited batches (would need a
real SQL tokenizer and would diverge query's output shape from
its CLI counterpart — violates plan § 4 uniformity).

Plan §§ 3, 5, 8, 11, 12 updated. Q3-Q5 still open.

Q3 (resource caching): Memoize on first read_resource call. All 4 resources are constant per server-process lifetime, so eager and lazy produce identical observable behavior; lazy just keeps boot lean for sessions that never read resources.

Q4 (naming): Every MCP reference implementation (spec, GitHub MCP, Cursor built-ins) uses snake_case. CLI stays kebab; the kebab→snake translation lives at the MCP-arg layer.

Q5 (save_baseline shape): save_baseline({name, sql?, recipe?}) with a runtime check that exactly one of sql/recipe is set. Mirrors the CLI's single --save-baseline verb. All 5 grill questions now settled; ready for tracer 2.

Wires the first two real MCP tools per plan §§ 3, 5, 11:

- `query` — wraps a single SELECT against .codemap.db. Returns the
  exact JSON envelope `codemap query --json` would print: row array
  by default, {count} under summary, {group_by, groups} under
  group_by. Mirrors plan § 4's uniformity commitment (MCP responses
  structurally identical to CLI output).

- `query_batch` — N statements in one round-trip. Items are either
  bare SQL strings (inherit batch-wide flags) or objects {sql,
  summary?, changed_since?, group_by?} that override on a per-key
  basis. Returns N envelopes; per-element shape mirrors single-
  query output for that statement's effective flag set. Per-statement
  errors are isolated — a failed statement returns {error} in its
  slot while siblings still execute (partial-success semantics that
  match what an agent expects when batching independent reads).
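
The partial-success behaviour of query_batch can be illustrated with a small sketch. `runBatch` and the injected `runOne` callback are hypothetical stand-ins for the real engine entry points; no database is involved here.

```typescript
// Sketch under assumptions: `runOne` stands in for the real single-statement engine.
type Row = Record<string, unknown>;
type SlotResult = Row[] | { error: string };

/** Run every statement; a failure fills only its own slot with {error}. */
export function runBatch(
  statements: string[],
  runOne: (sql: string) => Row[],
): SlotResult[] {
  return statements.map((sql) => {
    try {
      return runOne(sql);
    } catch (err) {
      // Capture the failure in place so sibling statements still execute.
      return { error: err instanceof Error ? err.message : String(err) };
    }
  });
}
```

The result array always has one element per input statement, in input order, so an agent can match slots to statements by index.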

Layering follows the cmd-audit.ts <-> audit-engine.ts seam from
PR #33:

- src/application/query-engine.ts — pure transport-agnostic
  executeQuery + executeQueryBatch returning JSON envelopes (no
  console.log). Mirrors printQueryResult's logic but returns
  the data instead of printing it.
- src/application/mcp-server.ts — registers both tools, handles
  flag merging (string vs object form), resolves --changed-since
  refs to file sets via getFilesChangedSince (memoised per-ref
  across batch items so a batch with N items sharing the same ref
  does one git invocation, not N).
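
The per-ref memoisation can be sketched roughly as below. The name `makeChangedFilesResolver` is mentioned in the review comments on this PR, but this body is an assumption; `listChanged` is a hypothetical injected callback standing in for the git invocation.

```typescript
// Sketch only: the real resolver wraps getFilesChangedSince; here git is injected.
type ChangedFiles = Set<string> | { error: string };

/** Build a resolver that asks git at most once per ref for its lifetime. */
export function makeChangedFilesResolver(
  listChanged: (ref: string) => string[],
): (ref: string) => ChangedFiles {
  const cache = new Map<string, ChangedFiles>();
  return (ref) => {
    const hit = cache.get(ref);
    if (hit !== undefined) return hit;

    let result: ChangedFiles;
    try {
      result = new Set(listChanged(ref));
    } catch (err) {
      // Failures are cached too, so a bad ref is not retried N times per batch.
      result = { error: err instanceof Error ? err.message : String(err) };
    }
    cache.set(ref, result);
    return result;
  };
}
```

Creating one resolver per tool call matches the per-tool-call cache lifetime described in the follow-ups; a server-lifetime cache would need an invalidation story first.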

Tests: 8 engine tests cover default rows / summary / group_by /
error / changedFiles paths plus batch isolation; 8 in-process MCP
tests use @modelcontextprotocol/sdk's InMemoryTransport to verify
tools/list, single-query envelope, summary, error payload, batch
defaults, per-statement override, string-form inheritance, and
partial-error isolation.

Replaces the ping stub from Tracer 1.
Wires the query_recipe MCP tool per plan §§ 3, 5, 11. Looks up the
SQL + per-row actions from src/cli/query-recipes.ts (data registry,
no execution flow crosses cli → application — layer note in
mcp-server.ts), then delegates to executeQuery with recipeActions
threaded through. Per-row actions land verbatim on each output
row exactly as `codemap query --json --recipe <id>` would print.

Composes with the same flag set as `query` (summary, changed_since,
group_by) — same JSON envelope contract, same per-flag shape.
Unknown recipe id returns a structured {error} payload pointing
the agent at the codemap://recipes resource (lands in tracer 6).

Tests: 4 in-process MCP tests for tools/list, unknown-recipe error,
actions-attached-to-rows (via seeded @deprecated symbol), and
summary composition.
Wires three more MCP tools per plan §§ 3, 5, 11.

- audit — composes per-delta baselines (files/dependencies/deprecated)
  into the same {head, deltas} envelope `codemap audit --json` prints.
  Args: baseline_prefix (auto-resolves <prefix>-<deltaKey>), baselines
  (explicit per-delta override), summary (collapses each delta to
  {added: N, removed: N}), no_index (skip auto-incremental-index
  prelude — default re-indexes so head reflects current source).
  Reuses resolveAuditBaselines + runAudit from PR #33's engine
  unchanged; no new business logic.

- context — wraps buildContextEnvelope. Returns the same
  {codemap: {schema_version, ...}, project: {root, file_count, ...},
  hubs?, sample_markers?} envelope `codemap context --json` prints.
  The agent-shaped session-bootstrap call: one round-trip replaces
  4-5 query calls.

- validate — wraps computeValidateRows. Compares on-disk SHA-256 to
  indexed files.content_hash, returns rows with status
  ('ok'/'changed'/'missing'). Empty `paths: []` validates every
  indexed file. Unlocks "codemap doctor" agents that diagnose
  stale .codemap.db before issuing structural queries (the use case
  surfaced in plan § 12 Q1).
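
As a rough illustration of the validate comparison, the sketch below hashes file content and classifies each path. It assumes hex-encoded SHA-256 digests and an injected `readFile` callback; `validateRows` and `sha256Hex` are hypothetical names, not the signature of computeValidateRows.

```typescript
import { createHash } from "node:crypto";

// Sketch only: assumes content_hash is stored as a hex SHA-256 digest.
type ValidateStatus = "ok" | "changed" | "missing";
type ValidateRow = { path: string; status: ValidateStatus };

export function sha256Hex(content: string | Uint8Array): string {
  return createHash("sha256").update(content).digest("hex");
}

/** Compare on-disk content against indexed hashes; empty `paths` means all files. */
export function validateRows(
  indexed: Map<string, string>, // path -> indexed content_hash
  readFile: (path: string) => string | Uint8Array | undefined, // undefined = not on disk
  paths: string[] = [],
): ValidateRow[] {
  const targets = paths.length > 0 ? paths : [...indexed.keys()];
  return targets.map((path): ValidateRow => {
    const content = readFile(path);
    if (content === undefined) return { path, status: "missing" };
    // A path that was never indexed has no hash to match, so it reports "changed".
    return { path, status: sha256Hex(content) === indexed.get(path) ? "ok" : "changed" };
  });
}
```

An agent could call this kind of check before issuing structural queries and re-index only when some row is not `ok`.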

Tests: 5 new in-process MCP tests for tools/list (now expects
audit/context/validate in addition to query/query_batch/query_recipe),
audit's no-baseline-resolves error, audit envelope shape (using a
seeded snap-files baseline that matches the seeded files exactly →
no drift), context envelope shape smoke test, validate happy path.

Layer note: cmd-audit's resolveAuditBaselines, cmd-context's
buildContextEnvelope, and cmd-validate's computeValidateRows are
imported from src/cli/* (their CLI verb owns the function today).
Same layer-reversal allowance as query-recipes — pure functions /
data registry, no execution flow crosses cli → application.
…f 7)

Wires the three baseline MCP tools per plan §§ 3, 5, 11. Settled in
the grill round (plan § 12 Q5): save_baseline ships as ONE polymorphic
tool with optional sql / recipe inputs (mirrors the CLI's single
--save-baseline=<name> verb), with a runtime check that exactly one
of sql / recipe is set. Two-tools alternative was rejected — fragments
the surface for one conceptual operation.

- save_baseline({name, sql? | recipe?}) — runs the SQL (resolved from
  recipe id if needed), captures rows, upserts into query_baselines
  with name + recipe_id + sql + rows_json + row_count + git_ref +
  created_at. Reuses upsertQueryBaseline directly.
- list_baselines() — returns the array `codemap query --baselines
  --json` would print (no rows_json payload).
- drop_baseline({name}) — deletes the named baseline. Returns
  {dropped: <name>} on success or isError if name doesn't exist.
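
The runtime exclusivity check on save_baseline can be sketched like this. `pickBaselineSource` and the error wording are hypothetical; the PR only states that exactly one of sql / recipe must be set.

```typescript
// Sketch only: helper name and error message are illustrative assumptions.
type SaveBaselineArgs = { name: string; sql?: string; recipe?: string };
type BaselineSource =
  | { kind: "sql"; sql: string }
  | { kind: "recipe"; recipe: string }
  | { error: string };

/** Accept exactly one of `sql` / `recipe`; reject both-set and neither-set. */
export function pickBaselineSource(args: SaveBaselineArgs): BaselineSource {
  if (args.sql !== undefined && args.recipe === undefined) {
    return { kind: "sql", sql: args.sql };
  }
  if (args.recipe !== undefined && args.sql === undefined) {
    return { kind: "recipe", recipe: args.recipe };
  }
  return { error: "save_baseline: pass exactly one of `sql` or `recipe`" };
}
```

A schema alone cannot easily express "exactly one of two optional keys", which is presumably why the check runs at call time rather than in the input schema.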

git_ref capture uses tryGetGitRefSafe (mirrors the helper in
cmd-query.ts; kept local to avoid a cli → application import).
git rev-parse may legitimately fail (no git, detached worktree) —
baselines record git_ref = NULL in that case.

Tests: 7 new in-process MCP tests cover tools/list, the runtime
exclusivity check (both ways), SQL save + list round-trip, recipe
save (recipe_id surfaces in payload), unknown-recipe error, and
drop-then-redrop semantics.
Wires the four MCP resources per plan §§ 7, 11. Settled in the grill
round (plan § 12 Q3): lazy memoisation — every resource is constant
for the server-process lifetime, so eager-vs-lazy produce identical
observable behavior; lazy keeps boot lean for sessions that never
call read_resource.

- codemap://recipes — full catalog JSON (same as --recipes-json).
  Reuses listQueryRecipeCatalog().
- codemap://recipes/{id} — one recipe ({id, description, sql,
  actions?}). Template form: list-callback enumerates one URI per
  recipe id so resources/list surfaces the catalog. Replaces
  --print-sql for agents.
- codemap://schema — DDL of every indexed table, queried live from
  sqlite_schema (lets the agent discover what columns exist without
  reading docs).
- codemap://skill — full text of templates/agents/skills/codemap/
  SKILL.md via resolveAgentsTemplateDir(). Agents that don't preload
  the bundled skill at session start fetch it here.

Caches are per-server-instance Map / single-string memos populated
on first read_resource call. Never invalidated — server process is
short-lived (agent host respawns it on package update or session
restart).
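
A minimal sketch of the two memo shapes described above, a single-value memo and a per-key Map. `lazy` and `lazyByKey` are hypothetical helper names; the loaders stand in for whatever builds each resource payload.

```typescript
// Sketch only: helper names are illustrative, not taken from mcp-server.ts.

/** Single-value memo: the loader runs on first read and never again. */
export function lazy<T>(load: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (cached === undefined) cached = { value: load() };
    return cached.value;
  };
}

/** Per-key memo, e.g. one entry per recipe id. Entries are never invalidated. */
export function lazyByKey<T>(load: (key: string) => T): (key: string) => T {
  const cache = new Map<string, T>();
  return (key) => {
    if (!cache.has(key)) cache.set(key, load(key));
    return cache.get(key) as T;
  };
}
```

Wrapping the cached value in an object lets the memo distinguish "not loaded yet" from a loader that legitimately returns `undefined`.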

Tests: 5 new in-process tests cover resources/list (3 static + N
templated by recipe id), each resource's payload shape, and the
SKILL.md frontmatter sanity check.
…of 7)

Lifts the canonical bits out of docs/plans/agent-transports.md per
docs/README.md Rule 2 (delete plans on ship), with a small self-audit
cleanup pass on src/application/mcp-server.ts (Rule 9 + concise-comments).

Docs:

- architecture.md § CLI usage gains an "MCP wiring" paragraph
  documenting the cmd-mcp.ts ↔ mcp-server.ts seam, the engine
  reuse pattern (executeQuery, runAudit, etc.), tool naming
  (snake_case at MCP layer; CLI stays kebab), and lazy resource
  caching.
- glossary.md adds two entries under § M: `codemap mcp` / MCP
  server, and `query_batch` (MCP-only tool) — covers the new
  domain nouns per Rule 9.
- roadmap.md replaces the "Agent transports v1+v1.x" entry with
  just the v1.x slice (codemap serve / HTTP API), since v1
  shipped.
- README.md CLI block adds the MCP server example.
- .agents/rules/codemap.md + templates/agents/rules/codemap.md
  (mirrored per Rule 10): new MCP section in the table + an
  "MCP server (codemap mcp)" reference block listing all tools,
  query_batch's polymorphic shape, save_baseline's exclusivity,
  the four resources, and the output-shape uniformity guarantee.
- .agents/skills/codemap/SKILL.md + templates/agents/skills/
  codemap/SKILL.md (mirrored): full per-tool reference + per-
  resource reference, with implementation notes (cmd-mcp.ts ↔
  mcp-server.ts seam, changed_since memoisation per (root, ref)).
- docs/plans/agent-transports.md DELETED (Rule 2 — plan content
  fully lifted into architecture.md and the agent files).

Self-audit cleanup on mcp-server.ts:

- Outdated header docstring trimmed (was "Tracer 2 wires X;
  subsequent tracers add Y" — now all shipped).
- Extracted `isEnginePayloadError` type-guard helper to dedup the
  4-line error-payload narrowing (previously inlined in query,
  query_recipe, save_baseline handlers).
- mergeBatchItem comment slimmed (was 4 lines, now 1) per
  concise-comments rule.

Changeset bumped to minor (was patch placeholder for the scaffold)
and rewritten to describe the complete shipped surface — squash-merge
pulls all 7 tracers into one ship-defining commit.

Known follow-ups (deferred — separate PRs):

- Lift cli/* imports (resolveAuditBaselines, buildContextEnvelope,
  computeValidateRows, query-recipes) into application/ — currently
  4 layer-reversal imports, each documented but accumulating.
- Server-lifetime changed_since cache (currently per-tool-call) —
  needs a staleness-invalidation story before it's safe.
- Pool DB connection (currently openDb/closeDb per tool call) —
  measure first; bun:sqlite open is microseconds.
- Split mcp-server.ts when it crosses ~1000 lines (currently 770).
@SutuSebastian SutuSebastian marked this pull request as ready for review May 1, 2026 14:25
@SutuSebastian

@coderabbitai review

coderabbitai Bot commented May 1, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 8

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/architecture.md`:
- Around line 128-129: The claim that "per-statement errors are isolated" is too
broad because failures resolving changed_since currently abort the whole request
in registerQueryBatchTool(); either narrow the docs to say isolation only
applies to SQL/engine/runtime failures (leave
registerQueryBatchTool/executeQueryBatch unchanged), or change
registerQueryBatchTool() to resolve each item's changed_since inside the
per-item loop with a try/catch so failures produce a per-slot {error} result
(matching executeQueryBatch's per-statement semantics) and continue processing
other items; update the changed-since memoisation logic so the per-(root,ref)
cache is consulted/filled inside that per-item resolution path.

In `@docs/glossary.md`:
- Around line 246-252: The docs incorrectly state snake_case “matches MCP spec”;
update the glossary entries for `codemap mcp` and `query_batch` to say that tool
input/output keys use snake_case as a Codemap convention (not a protocol
requirement). Edit the wording around "Tool input/output keys are snake_case"
to: e.g., "Tool input/output keys use snake_case (a Codemap convention); CLI
stays kebab — translation at the MCP-arg layer." Keep references to `codemap
mcp`/`query_batch` and the implementation files `src/cli/cmd-mcp.ts` and
`src/application/mcp-server.ts` so readers can find the code paths that perform
the kebab↔snake translations.

In `@README.md`:
- Around line 113-117: Reword the README wording that implies “one tool per CLI
verb” to explicitly mark that query_batch is an MCP-only tool: update the
sentence around "codemap mcp" and the tools list so it reads something like
“codemap mcp — JSON-RPC on stdio (MCP-only tools listed below); most commands
map to one CLI verb, except MCP-only tools such as query_batch.” Ensure the
symbol query_batch is named and that the phrase "one tool per CLI verb" is
adjusted to avoid implying query_batch is a normal CLI verb.

In `@src/application/mcp-server.ts`:
- Around line 245-254: The current batch handler returns immediately on a single
changed_since resolution error (using jsonError) which aborts the entire batch;
instead, change the logic in the anonymous handler (the function that calls
makeChangedFilesResolver, mergeBatchItem and resolveChanged) so that when
resolveChanged(merged.changed_since) yields an error object you do not return
early but record an item-level error into the resolved array (preserving the
original statement order) — e.g., push a BatchStatementResolved entry
representing the per-statement error (including the error payload) and continue
processing subsequent statements; apply the same per-item error-capture fix to
the similar block around resolveChanged handling at the 262-267 region so each
statement can succeed/fail independently.
- Line 132: Add a concise JSDoc block above the exported createMcpServer
function that documents its purpose, the shape and required/optional properties
of the ServerOpts parameter, and what the returned McpServer represents and
guarantees (e.g., lifecycle methods or interfaces it implements); reference the
createMcpServer function name and the ServerOpts/McpServer types in the
description and include parameter (`@param`) and return (`@returns`) tags so the
exported API is clearly specified for consumers and tooling.

In `@src/application/query-engine.ts`:
- Around line 40-51: Add short JSDoc comments above the exported types to
document their public API: describe QueryResultPayload as the possible shapes
returned from executeQuery (array of records, count-only object, grouped results
with GroupByMode and groups array, or grouped key/count pairs) and document
ExecuteQueryError as the error object returned on failure with an error string
field; update the comments located near the exported symbols QueryResultPayload
and ExecuteQueryError to follow the project’s public API documentation style.
- Around line 59-66: executeQuery calls db.query(opts.sql) without enforcing
read-only semantics, so caller SQL can mutate the database. Open the DB in
read-only mode or enable SQLite's query-only mode before executing caller SQL
(e.g., call openDb with a readOnly flag or call db.exec("PRAGMA query_only =
ON") on the returned handle), and, as a second line of defense, validate that
opts.sql is a read-only statement (e.g., starts with SELECT/PRAGMA). Update
executeQuery (and openDb if needed) to use the read-only option so the engine
cannot perform DROP/DELETE/UPDATE from opts.sql.

In `@src/cli/cmd-mcp.ts`:
- Around line 35-38: Update the MCP command help text in the src/cli/cmd-mcp.ts
module (the command definition / help string for the "mcp" command) to reflect
the current surface: list the available verbs (query, query_recipe, audit,
baseline ops, context, validate, plus ping) instead of saying "Tracer 1 only
ships the ping stub", and remove or replace the pointer to
docs/plans/agent-transports.md with either an accurate docs link or omit it;
edit the help/description string used by the command registration (the variable
or literal passed into the command builder for "mcp") so running "codemap mcp
--help" shows the correct commands and guidance.
📥 Commits

Reviewing files that changed from the base of the PR and between 7718691 and 3fe2eb8.

⛔ Files ignored due to path filters (1)
  • bun.lock is excluded by !**/*.lock
📒 Files selected for processing (19)
  • .agents/rules/codemap.md
  • .agents/skills/codemap/SKILL.md
  • .changeset/agent-transports-mcp-scaffold.md
  • README.md
  • docs/architecture.md
  • docs/glossary.md
  • docs/plans/agent-transports.md
  • docs/roadmap.md
  • package.json
  • src/application/mcp-server.test.ts
  • src/application/mcp-server.ts
  • src/application/query-engine.test.ts
  • src/application/query-engine.ts
  • src/cli/bootstrap.ts
  • src/cli/cmd-mcp.test.ts
  • src/cli/cmd-mcp.ts
  • src/cli/main.ts
  • templates/agents/rules/codemap.md
  • templates/agents/skills/codemap/SKILL.md
💤 Files with no reviewable changes (1)
  • docs/plans/agent-transports.md

…6 Minor)

All 8 verified correct against the actual code; all applied.

Critical:

- query-engine.ts: enforce read-only at the engine boundary via
  PRAGMA query_only = 1. Without it, agents could DROP / DELETE /
  UPDATE through `query` / `query_recipe` / `query_batch` despite
  these tools being contractually read-only. SQLite-level
  enforcement — parser-proof, doesn't bleed across calls (closeDb
  discards the connection). Two new tests confirm DML and DDL
  are rejected and the data is preserved.
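
A minimal sketch of the engine-boundary enforcement described above, not the actual query-engine.ts code. The `DbHandle` interface, `runReadOnly`, and the `open` factory are hypothetical stand-ins for whatever SQLite driver the engine uses; only the `PRAGMA query_only = 1` statement and the close-after-use behaviour come from the commit message.

```typescript
// Hypothetical minimal handle; the real engine would wrap its SQLite driver.
interface DbHandle {
  exec(sql: string): void;
  all(sql: string): Record<string, unknown>[];
  close(): void;
}

type QueryOutcome =
  | { rows: Record<string, unknown>[] }
  | { error: string };

// Run caller-supplied SQL on a fresh connection with writes disabled.
function runReadOnly(open: () => DbHandle, sql: string): QueryOutcome {
  const db = open();
  try {
    // With query_only on, SQLite rejects statements that would modify the database.
    db.exec("PRAGMA query_only = 1");
    return { rows: db.all(sql) };
  } catch (err) {
    // Surface the driver's message instead of throwing across the tool boundary.
    return { error: err instanceof Error ? err.message : String(err) };
  } finally {
    // Closing discards the connection, so the pragma cannot leak into later calls.
    db.close();
  }
}
```

Because the pragma is set per connection and the connection is closed in `finally`, a failed statement leaves no state behind for the next call.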

Major:

- mcp-server.ts: query_batch now isolates changed_since failures
  per slot. A bad git ref in statement i lands in slot i with
  {error} instead of aborting the whole batch — matches the
  per-statement isolation contract documented in plan § 5 and
  executeQueryBatch's docstring. Refactored the loop to map over
  args.statements with a per-item try/catch. Dropped the now-unused
  executeQueryBatch import (the helper still ships as a public
  query-engine API for non-MCP callers; tests cover it). New test
  uses a deliberately-bad ref to confirm the sibling statement
  still runs.
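
The per-slot isolation can be sketched roughly as follows. This is an illustration under assumptions, not the mcp-server.ts implementation: `runBatch`, `BatchDeps`, `resolveChangedFiles`, and `run` are hypothetical names, and the item shape is trimmed to `sql` and `changed_since` from the PR description.

```typescript
// Item shape trimmed from the PR description; everything else here is hypothetical.
type BatchItem = string | { sql: string; changed_since?: string };
type SlotResult = { rows: unknown[] } | { error: string };

interface BatchDeps {
  // Resolves a git ref to changed file paths; assumed to throw on a bad ref.
  resolveChangedFiles(ref: string): string[];
  // Executes one read-only statement, optionally scoped to changed files.
  run(sql: string, changedFiles?: string[]): unknown[];
}

function runBatch(
  statements: BatchItem[],
  defaults: { changed_since?: string },
  deps: BatchDeps,
): SlotResult[] {
  return statements.map((item): SlotResult => {
    try {
      const sql = typeof item === "string" ? item : item.sql;
      // A per-statement override wins over the batch-wide default.
      const ref =
        typeof item === "string"
          ? defaults.changed_since
          : item.changed_since ?? defaults.changed_since;
      const changed = ref === undefined ? undefined : deps.resolveChangedFiles(ref);
      return { rows: deps.run(sql, changed) };
    } catch (err) {
      // The failure stays in this slot; sibling statements still run.
      return { error: err instanceof Error ? err.message : String(err) };
    }
  });
}
```

Putting the ref resolution inside the same try/catch as the query is what keeps a bad git ref from aborting the whole batch.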

Minor:

- mcp-server.ts: createMcpServer now carries a JSDoc contract
  (purpose, opts contract, lifecycle ownership note for tests vs
  prod).
- query-engine.ts: QueryResultPayload + ExecuteQueryError exported
  types now carry JSDoc covering shape discriminants and narrowing
  guidance.
- cmd-mcp.ts: --help text refreshed — was still describing Tracer 1
  with only the ping stub and pointing at docs/plans/agent-
  transports.md (which was deleted in Tracer 7). Now lists every
  shipped tool + resource and points at architecture.md.
- README.md: clarified `query_batch` is MCP-only (it isn't a CLI
  verb, so "one tool per CLI verb" read too literally).
- glossary.md / architecture.md / .agents rule + skill / templates
  rule + skill (mirrored across all 6 surfaces per Rule 10):
  reworded "matches MCP spec" → "Codemap convention matching MCP
  spec examples + reference servers (GitHub MCP, Cursor built-ins);
  spec is convention-agnostic." The protocol doesn't mandate
  snake_case — it's our convention informed by the ecosystem.

Architecture L129 claim ("per-statement errors are isolated") is
no longer narrowed — the underlying behavior in mcp-server.ts now
matches the broad claim, so the docs stay as-is.
@SutuSebastian SutuSebastian merged commit 119db38 into main May 1, 2026
9 checks passed
@SutuSebastian SutuSebastian deleted the feat/agent-transports-mcp branch May 1, 2026 14:43
SutuSebastian added a commit that referenced this pull request May 1, 2026
)

* docs(research): refresh fallow.md + scan against current ship state

fallow.md gains a "Status snapshot (as of 2026-05-01)" section that
tabulates every adoption candidate's ship status — single source of
truth for "what's open" without munging the original tier tables.

Captures:
- Tier A all shipped (PR #26)
- B.5 partial (v1 in PR #33; --base <ref> + verdict deferred to v1.x)
- B.6 shipped (PR #30) — table-in-DB, not parallel JSON files
- B.7 shipped (PR #28) — landed on `symbols`, not `exports`
- B.8 / C.9 / C.10 / C.11 / D.* still as-was
- MCP server (agent-transports v1) shipped in PR #35 (adjacent —
  not a numbered fallow candidate but worth surfacing here)

§ 6 open questions: marks the 2 settled ones (actions ownership,
audit verdict default) with their resolution PRs; preserves the 2
still-open ones (coverage column shape, plugin layer scope).

§ 3 already-shipped block: updates the visibility-tags note to
acknowledge B.7 promoted it from regex to structured column instead
of saying "B.7 proposes promoting" (which it doesn't anymore).

competitive-scan-2026-04.md § 4: marks MCP server wrapping `query`
as ✅ shipped via PR #35 with a cross-link to fallow.md's status
snapshot. Other items still tracked there.

No behavior change; pure docs refresh to match current reality.

* docs(research): fix MD056 — D row in fallow.md status snapshot was 4 cells, header was 5

CodeRabbit caught: status-snapshot table header has 5 columns (Tier / # / Item / Status / Where) but the D.12-D.16 row only had 4 (collapsed Status + Where into one cell). Markdown parses that as a malformed table; renderers either drop the row or misalign neighbouring rows. Added the missing 5th cell pointing back at § 1's Defer / skip table for the per-row reasoning.

* docs(research): align B.7 row title to shipped column name (symbols.visibility)

CodeRabbit caught: Tier B table B.7 row title still said 'exports.visibility column' despite the body hedging '(or symbols)' AND the shipped column landing on symbols. Status snapshot row at L22 already says symbols. Updated the title to match shipped reality + added an explicit nod to the original hedge so the historical-record property survives.
SutuSebastian added a commit that referenced this pull request May 1, 2026
…ckstop

Defence in depth: lexical scan rejects DML/DDL at load with recipe-aware error UX (fires in CI / pre-commit). PRAGMA query_only runtime backstop from PR #35 stays as the parser-proof safety net for what lexical can't catch (WITH clauses, multi-statement, attached DBs).

All 6 grill questions now settled — ready for tracer 1.
SutuSebastian added a commit that referenced this pull request May 2, 2026
… (Tracer 5 of 6)

Closes the Q-D + Q-F open questions from the grill round.

Q-D — actions for project-local recipes:
- Hand-rolled YAML frontmatter parser in extractFrontmatterAndBody (~30 LOC core, ~50 LOC including helpers). Strict shape: one optional 'actions' list of {type, auto_fixable?, description?} between --- delimiters at the top of <id>.md. Other top-level keys tolerated (forward-compat for future recipe metadata). Unknown action keys silently ignored. Items missing 'type' are filtered out (defensive).
- Lifted the 6 bundled recipes' actions (fan-out, fan-in, files-largest, deprecated-symbols, visibility-tags, barrel-files) from BUNDLED_RECIPE_ACTIONS in cli/query-recipes.ts into YAML frontmatter on each templates/recipes/<id>.md. The map is gone — uniform shape for both bundled and project recipes (Q-A's promised 'one storage shape, one loader code path').

Q-F — load-time DML/DDL lexical check:
- validateRecipeSql exported from recipes-loader. Strips -- comments, finds first identifier-shaped token, rejects if in deny-list (INSERT/UPDATE/DELETE/DROP/CREATE/ALTER/ATTACH/DETACH/REPLACE/TRUNCATE/VACUUM/PRAGMA). Recipe-aware error message points at --save-baseline as the legitimate path for capturing rows.
- Runtime PRAGMA query_only=1 backstop from PR #35 stays unchanged — different jobs: lexical = good UX for common mistakes; backstop = correctness for what slips by.

Lessons re-learned (already in .agents/lessons.md): backticks containing colons in line/block comments break Bun's parser; /* */ inside backticks closes the surrounding /** */ JSDoc. Avoided both by replacing problematic backticks with plain quotes / parentheses.

Tests: 27 new — 13 for validateRecipeSql, 7 for extractFrontmatterAndBody, 1 integration confirming actions + description both populate from a single .md. Total now 54 pass on the loader + shim test files.
SutuSebastian added a commit that referenced this pull request May 2, 2026
…cal .codemap/recipes/ (#37)

* docs(plans): draft recipes-content-registry

Pair bundled recipes with sibling .md (when-to-use / follow-up SQL); enable project-local recipes via .codemap/recipes/<id>.{sql,md}; auto-inherit into the codemap://recipes / recipes/{id} MCP resources shipped in PR #35.

Plan covers storage layout (file-pair vs YAML-frontmatter — file-pair wins per editor + LSP support), loader contract (eager + cached, pure transport-agnostic engine in src/application/recipes-loader.ts), CLI surface (zero new flags — same shape; --recipes-json gains source + body fields), and a 6-commit tracer-bullet sequence.

6 open questions worth a grill round before code: bundled storage layout, loading time (eager vs lazy), monorepo discovery walk-up, actions for project recipes (skip / frontmatter / sibling .json), conflict resolution noise level, and load-time DML/DDL rejection. Status: design pass; not yet implemented.

* docs(plans): settle Q-A — file-pair storage for bundled recipes

templates/recipes/<id>.{sql,md} for both bundled and project recipes. One loader code path, SQLite syntax highlighting in every editor, single-file diffs, sqlite3 .read works for ad-hoc testing. Migration is ~15 files; shim layer in cli/query-recipes.ts preserves backwards-compat exports.

* docs(plans): settle Q-B — eager registry load at startup

~15-20 small file reads is sub-millisecond on warm SSD; rounding error vs node/bun startup. 'Registry is always populated' kills lazy guards across three call sites; surfaces malformed-recipe errors early. Rejected disk cache — over-engineered for static SQL strings.

* docs(plans): document DB-vs-filesystem rationale + gitignore verification + settle Q-C (root-only)

Folds two grill-round insights into the plan:

1. § 3.2 gains a 'Gitignore note (verified, not assumed)' paragraph — git check-ignore confirmed .codemap/recipes/ is NOT matched by the existing .codemap.* literal-dot pattern. Project recipes are checked into git by default, intended behavior. Consumer-side risk (their own .gitignore using .codemap*) is documented; agent rule + skill will recommend !.codemap/recipes/ un-ignore.

2. New § 3.3 'Why filesystem and not .codemap.db' captures the side-by-side test against query_baselines (which IS in DB, opposite call): nature (output vs input), index-state coupling, human-authored-for-review, meaningful-outside-one-DB. Records the 'send a recipe to a colleague' deciding test (file: send the .sql; DB: reinvent files via export/import). Bundled-recipes-in-npm-package angle reinforces.

3. Q-C settled: root-only (<projectRoot>/.codemap/recipes/). Walk-up would make recipes the only codemap piece resolving differently from .codemap.db / indexer / resolver. Forward-compatible: root-only-→-walk-up is non-breaking; the reverse would be.

* docs(plans): settle Q-D — YAML frontmatter on .md for project recipe actions

Hand-rolled parser (~30 LOC) handles only the shallow shape codemap needs (key/list/string/bool). Frontmatter co-locates the action with its prose. Project recipes feel first-class with the same actions template surface bundled recipes have. Rejected gray-matter / js-yaml (~50KB for full YAML 1.2 spec we don't need) and sibling .actions.json (wrong factoring — separates action from explanation).

* docs(plans): settle Q-E — silent shadowing + shadows flag + agent-skill prompt update

Three-layer answer optimised for agent DX + traceability:

1. Silent at runtime (matches user-code-wins convention).
2. shadows: true flag in catalog responses (--recipes-json, codemap://recipes, codemap://recipes/{id}) — discovery-time provenance.
3. Bundled skill prompt instructs agents to read codemap://recipes at session start + check shadows.

Per-execution response shape stays unchanged (preserves plan § 4 uniformity). Stderr warnings rejected (MCP-stderr logs don't surface to the model anyway). --allow-shadow flag rejected (hostile to legitimate override case). Loader cost: ~5 LOC for the shadow check.

* docs(plans): settle Q-F — load-time lexical check + runtime PRAGMA backstop

Defence in depth: lexical scan rejects DML/DDL at load with recipe-aware error UX (fires in CI / pre-commit). PRAGMA query_only runtime backstop from PR #35 stays as the parser-proof safety net for what lexical can't catch (WITH clauses, multi-statement, attached DBs).

All 6 grill questions now settled — ready for tracer 1.

* feat(recipes): loader scaffold + merge logic (Tracer 1 of 6)

Pure transport-agnostic loader in src/application/recipes-loader.ts (mirrors the cmd-* ↔ *-engine seam from PR #33). Scope per plan §8 Tracer 1:

- LoadedRecipe interface (canonical shape; bundled + project share it)
- RecipeAction interface lifted from cli/query-recipes.ts (will become the canonical home; query-recipes becomes a shim in Tracer 2)
- readRecipesFromDir(dir, source) — reads <id>.sql, pairs with optional <id>.md (description = first non-empty line, body = full text). Returns [] for missing/non-directory paths (project-recipes case where .codemap/recipes/ is absent — not an error). Throws on empty SQL with recipe-aware message
- mergeRecipes(bundled, project) — project wins on id collision; sets shadows: true on overriding entries (Q-E settled). Output sorted by id (deterministic catalog order)
- loadAllRecipes({bundledDir, projectDir}) — Tracer 1 wires bundled only; projectDir argument accepted but stubbed (returns []). Tracer 3 plugs project loader
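
As a rough illustration of the merge rule listed above (project wins on id collision, overriding entries get `shadows: true`, output sorted by id), here is a minimal sketch. The `LoadedRecipe` fields shown are an assumed subset; the real interface in recipes-loader.ts may carry more.

```typescript
type RecipeSource = "bundled" | "project";

// Assumed subset of the LoadedRecipe shape, for illustration only.
interface LoadedRecipe {
  id: string;
  sql: string;
  source: RecipeSource;
  shadows?: boolean;
}

function mergeRecipes(bundled: LoadedRecipe[], project: LoadedRecipe[]): LoadedRecipe[] {
  const byId = new Map<string, LoadedRecipe>();
  for (const recipe of bundled) byId.set(recipe.id, recipe);
  for (const recipe of project) {
    // A project recipe replaces a bundled one with the same id and is flagged.
    byId.set(recipe.id, byId.has(recipe.id) ? { ...recipe, shadows: true } : recipe);
  }
  // Sort by id so the catalog order is deterministic.
  return Array.from(byId.values()).sort((a, b) =>
    a.id < b.id ? -1 : a.id > b.id ? 1 : 0,
  );
}
```

Only overriding entries carry the flag; a project recipe with a fresh id is left without `shadows`.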

15 unit tests cover: missing dir, non-.sql ignore, sql-only loading, sibling-md pairing, heading-strip in description, deterministic id order, empty-sql rejection, comments-then-sql happy path, non-directory passthrough, all 4 merge cases (project-only / bundled-only / shadow / no-overlap), Tracer 1 stub behavior.

Layer note: query-recipes.ts (cli/) still owns QUERY_RECIPES + getQueryRecipeSql / getQueryRecipeActions / listQueryRecipeCatalog / listQueryRecipeIds. Tracer 2 migrates them to call into this loader.

* feat(recipes): migrate bundled recipes to templates/recipes/<id>.{sql,md} (Tracer 2 of 6)

QUERY_RECIPES TypeScript object map → templates/recipes/<id>.sql + sibling .md description files. cli/query-recipes.ts becomes a thin shim that calls loadAllRecipes() at first access and caches the result.

12 bundled recipes migrated: fan-out, fan-out-sample, fan-out-sample-json, fan-in, index-summary, files-largest, components-by-hooks, markers-by-kind, deprecated-symbols, visibility-tags, files-hashes, barrel-files. Each gets a .sql file (verbatim) + .md (description body — first non-empty line becomes the catalog 'description').

Backwards-compat preserved:
- QUERY_RECIPES exported as a Proxy so callers (cmd-query.ts, mcp-server.ts) can still use the legacy object-shape access (QUERY_RECIPES['fan-out'].description, Object.keys(QUERY_RECIPES), etc.) without changes
- getQueryRecipeSql / getQueryRecipeActions / listQueryRecipeIds / listQueryRecipeCatalog all derive from the registry — same return shapes
- Smoke tested: bun src/index.ts query --recipes-json + query --print-sql fan-out + query-golden all green

Bundled recipe actions stay in code (BUNDLED_RECIPE_ACTIONS map) through Tracer 5 — that tracer adds the YAML frontmatter parser and lifts these into the .md files alongside descriptions, completing the migration.

New: resolveBundledRecipesDir() in cli/query-recipes.ts mirrors resolveAgentsTemplateDir()'s npm-package layout (templates/recipes/ next to templates/agents/). _resetRecipesCacheForTests() escape hatch added for fixture swaps.

templates/ already shipped in the npm artifact (per package.json files); templates/recipes/ inherits.

Tracer 1's loader now has a real consumer; Tracer 3 will plug in projectDir for .codemap/recipes/<id>.sql discovery.

* feat(recipes): project-local loader for .codemap/recipes/<id>.sql (Tracer 3 of 6)

Wires up the actually-new user-facing capability per plan §1: teams ship internal SQL recipes via git-tracked .codemap/recipes/<id>.sql files.

Three pieces:

1. loadAllRecipes now reads opts.projectDir (was stubbed in Tracer 1). Composes via mergeRecipes — project wins on id collision with shadows: true flag (per Q-E settled).

2. resolveProjectRecipesDir(projectRoot) — root-only resolution per Q-C (no walk-up). Returns undefined if .codemap/recipes/ is missing or is a file rather than a directory; absence is not an error.

3. cli/query-recipes.ts shim's getRegistry() now resolves projectDir via getProjectRoot() (falls back to bundled-only if initCodemap hasn't run — covers direct unit-test paths). Cache key includes projectDir so multi-root sessions (test fixtures) re-resolve cleanly. _resetRecipesCacheForTests clears both halves.

5 new loader-engine tests: bundled-only / bundled+project / shadow detection / sorted ordering / missing-dir. 7 new shim tests: 3 for resolveProjectRecipesDir (absent / present / file-not-dir) + 4 for the end-to-end shim path (bundled-only baseline / project-local id surfaces / project shadows bundled / catalog merging).

Project recipes get actions: undefined through Tracer 5 — that tracer adds the YAML frontmatter parser.

* feat(recipes): catalog payload carries source / body / shadows (Tracer 4 of 6)

Extends QueryRecipeCatalogEntry with three new additive fields:

- body — full Markdown body of sibling <id>.md (description = first non-empty line; body = long-form 'when to use' / 'follow-up SQL' content)
- source — 'bundled' | 'project' (provenance discriminator)
- shadows — true ONLY on project entries that override a bundled recipe of the same id (per Q-E settled — agents check this at session start to know when a recipe behaves differently from the documented bundled version)

All additive: existing callers that destructure {id, description, sql, actions?} keep working unchanged.

New helper: getQueryRecipeCatalogEntry(id) — same shape as listQueryRecipeCatalog entries, for one id (undefined for unknown). Used by codemap://recipes/{id} MCP resource so the per-id payload includes the same provenance fields the full catalog has.

MCP server changes:
- codemap://recipes/{id} payload now includes body / source / shadows (replaced the inline {id, description, sql, actions?} construction with JSON.stringify(getQueryRecipeCatalogEntry(id)))
- codemap://recipes list-callback uses listQueryRecipeCatalog() (drops dependency on the legacy QUERY_RECIPES Proxy access)
- Resource description updated to 'Single recipe by id: {id, description, body?, sql, actions?, source, shadows?}'
- Removed unused listQueryRecipeIds + QUERY_RECIPES imports

5 new shim tests: bundled.source, bundled.body presence, project.source, project.shadows=true on override, getQueryRecipeCatalogEntry parity + unknown-id-undefined.

Tracer 5 next: YAML frontmatter parser for project-recipe actions + load-time DML/DDL lexical check.

* feat(recipes): YAML frontmatter actions + load-time DML/DDL deny-list (Tracer 5 of 6)

Closes the Q-D + Q-F open questions from the grill round.

Q-D — actions for project-local recipes:
- Hand-rolled YAML frontmatter parser in extractFrontmatterAndBody (~30 LOC core, ~50 LOC including helpers). Strict shape: one optional 'actions' list of {type, auto_fixable?, description?} between --- delimiters at the top of <id>.md. Other top-level keys tolerated (forward-compat for future recipe metadata). Unknown action keys silently ignored. Items missing 'type' are filtered out (defensive).
- Lifted the 6 bundled recipes' actions (fan-out, fan-in, files-largest, deprecated-symbols, visibility-tags, barrel-files) from BUNDLED_RECIPE_ACTIONS in cli/query-recipes.ts into YAML frontmatter on each templates/recipes/<id>.md. The map is gone — uniform shape for both bundled and project recipes (Q-A's promised 'one storage shape, one loader code path').
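
For readers who want a feel for how small such a parser can be, here is a hedged sketch of a block-list-only frontmatter reader. It borrows the `extractFrontmatterAndBody` name and the `{type, auto_fixable?, description?}` shape from the commit message, but the return type and every parsing detail are assumptions rather than the shipped code.

```typescript
interface RecipeAction {
  type: string;
  auto_fixable?: boolean;
  description?: string;
}

function extractFrontmatterAndBody(md: string): { actions: RecipeAction[]; body: string } {
  const lines = md.split(/\r?\n/);
  if ((lines[0] ?? "").trim() !== "---") return { actions: [], body: md };
  const end = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (end === -1) return { actions: [], body: md };

  const drafts: Partial<RecipeAction>[] = [];
  let inActions = false;
  let current: Partial<RecipeAction> | undefined;
  for (const raw of lines.slice(1, end)) {
    if (raw.trim() === "") continue;
    if (/^\S/.test(raw)) {
      // Top-level key: only a bare "actions:" opens the list; other keys are tolerated.
      inActions = /^actions:\s*$/.test(raw);
      current = undefined;
      continue;
    }
    if (!inActions) continue;
    const m = /^\s*(-\s+)?([A-Za-z_]+):\s*(.*)$/.exec(raw);
    if (!m) continue;
    if (m[1]) {
      // A leading "- " starts a new list item.
      current = {};
      drafts.push(current);
    }
    if (!current) continue;
    const key = m[2] ?? "";
    // Strip one pair of matching quotes around the value, if present.
    const value = (m[3] ?? "").trim().replace(/^(['"])(.*)\1$/, "$2");
    if (key === "type") current.type = value;
    else if (key === "description") current.description = value;
    else if (key === "auto_fixable") current.auto_fixable = value === "true";
    // Unknown action keys are silently ignored.
  }
  const actions = drafts.filter(
    (a): a is RecipeAction => typeof a.type === "string" && a.type !== "",
  );
  return { actions, body: lines.slice(end + 1).join("\n") };
}
```

In this sketch an inline-flow value such as `actions: [{type: x}]` is treated as an unrelated top-level key and yields no actions, so only the block-list form is recognised.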

Q-F — load-time DML/DDL lexical check:
- validateRecipeSql exported from recipes-loader. Strips -- comments, finds first identifier-shaped token, rejects if in deny-list (INSERT/UPDATE/DELETE/DROP/CREATE/ALTER/ATTACH/DETACH/REPLACE/TRUNCATE/VACUUM/PRAGMA). Recipe-aware error message points at --save-baseline as the legitimate path for capturing rows.
- Runtime PRAGMA query_only=1 backstop from PR #35 stays unchanged — different jobs: lexical = good UX for common mistakes; backstop = correctness for what slips by.

Lessons re-learned (already in .agents/lessons.md): backticks containing colons in line/block comments break Bun's parser; /* */ inside backticks closes the surrounding /** */ JSDoc. Avoided both by replacing problematic backticks with plain quotes / parentheses.

Tests: 27 new — 13 for validateRecipeSql, 7 for extractFrontmatterAndBody, 1 integration confirming actions + description both populate from a single .md. Total now 54 pass on the loader + shim test files.

* docs(recipes): architecture/glossary/README/agent rule + skill (Tracer 6 of 6)

Lifts canonical bits out of docs/plans/recipes-content-registry.md per docs/README.md Rule 2 (delete plans on ship). Surfaces touched:

- architecture.md § CLI usage gains a 'Recipes wiring' paragraph documenting the recipes-loader.ts ↔ query-recipes.ts seam, file-pair storage layout (templates/recipes/ for bundled, .codemap/recipes/ for project), shadow flag + load-time DML/DDL validation, and the .codemap.db-vs-.codemap/recipes/ gitignore asymmetry.

- glossary.md § R: 'recipe' definition expanded to disambiguate bundled vs project sources, surface the actions-via-frontmatter shape, validation, and runtime backstop. New entry 'recipe shadows' covering the override discovery pattern.

- roadmap.md: removed the recipes-as-content-registry backlog entry (now shipped).

- README.md CLI block: added a project-recipes example showing mkdir + echo + --recipe lookup; mentions the shadows discovery field.

- .agents/rules/codemap.md + templates/agents/rules/codemap.md (mirrored per Rule 10): new 'Project-local recipes' bullet right after Recipe actions covers the .codemap/recipes/ location, shadows: true catalog flag, YAML frontmatter shape, and load-time DML/DDL rejection.

- .agents/skills/codemap/SKILL.md + templates/agents/skills/codemap/SKILL.md (mirrored): codemap://recipes resource description gains the 'check shadows at session start' guidance + source/shadows fields; codemap://recipes/{id} payload shape extended to {id, description, body?, sql, actions?, source, shadows?}; new 'Project-local recipes' bullet in the recipe section gives agents the full reference.

- docs/plans/recipes-content-registry.md DELETED (Rule 2 — plan content fully lifted into architecture.md / glossary.md / agent files).

- Minor changeset added (additive features, no schema breaks).

* fix(recipes): address PR #37 CodeRabbit feedback (1 Major + 6 Minor)

All 7 verified valid against actual code; all applied.

Major:

- recipes-loader.ts: stripLineComments now strips block /* */ comments
  BEFORE the line-comment + first-keyword scan. Without this, the
  deny-list could be bypassed two ways:
  (1) A leading block comment containing a deny-listed keyword could
      cause the lexer to misclassify a legit SELECT as DDL.
  (2) /* SELECT */ DELETE FROM x would be accepted because the lexer
      saw 'SELECT' in the comment first.
  Now: regex strips block comments first, then line comments, then
  first-identifier match runs. Pure-block-comment files also trip the
  empty-recipe check correctly. The runtime PRAGMA query_only=1
  backstop is still the parser-proof safety net for things like
  string-literal-embedded comments (vanishingly rare). 3 new tests
  cover false-positive avoidance, smuggled-DELETE rejection, and
  pure-block-comment-as-empty.
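
The ordering described above (block comments, then line comments, then first-identifier match) can be sketched like this. `validateRecipeSql` and the deny-list come from the commit messages; `stripSqlComments`, the `(id, sql)` signature, and the error wording are hypothetical.

```typescript
const DENIED_KEYWORDS = new Set([
  "INSERT", "UPDATE", "DELETE", "DROP", "CREATE", "ALTER",
  "ATTACH", "DETACH", "REPLACE", "TRUNCATE", "VACUUM", "PRAGMA",
]);

// Hypothetical helper: remove block comments first, then line comments.
function stripSqlComments(sql: string): string {
  return sql
    .replace(/\/\*[\s\S]*?\*\//g, " ")
    .replace(/--[^\n]*/g, " ");
}

// Throws when the first keyword of the recipe is on the deny-list.
function validateRecipeSql(id: string, sql: string): void {
  const stripped = stripSqlComments(sql);
  const first = /[A-Za-z_][A-Za-z0-9_]*/.exec(stripped);
  if (!first) {
    throw new Error(`recipe "${id}": SQL is empty after removing comments`);
  }
  const keyword = first[0].toUpperCase();
  if (DENIED_KEYWORDS.has(keyword)) {
    throw new Error(
      `recipe "${id}": ${keyword} is not allowed in a recipe; use --save-baseline to capture rows`,
    );
  }
}
```

This is a lexical check only. A statement that begins with WITH passes it regardless of what follows, which is why the runtime query_only backstop still matters.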

Minor:

- glossary.md § Conventions: 'Recipe = bundled SQL string in
  src/cli/query-recipes.ts' was outdated (recipes are now files in
  templates/recipes/ + .codemap/recipes/; query-recipes.ts is the
  shim). Reworded with a forward-pointer to § R recipe.

- README.md CLI block: language IN ('ts', 'tsx') instead of
  language = 'typescript'. Verified via codemap query — the indexer
  stores 'ts' / 'tsx' / 'md' / 'json' etc., not the long form. The
  example as written would have returned 0 rows.

- .agents/rules/codemap.md + templates/agents/rules/codemap.md
  + .agents/skills/codemap/SKILL.md + templates/agents/skills/codemap/SKILL.md
  (mirrored 4-way per Rule 10): the YAML frontmatter doc was showing
  inline-flow shape (actions: [{type, ...}]) but the loader's hand-
  rolled parser only accepts block-list (- type:). Authors copying
  inline form would silently lose actions. Replaced with a fenced
  block showing the correct block-list form.

- templates/recipes/deprecated-symbols.md: WHERE name → WHERE
  callee_name. Verified via pragma_table_info — the calls table has
  caller_name + callee_name + caller_scope + file_path + id; no
  bare 'name' column. The recipe doc would have pointed agents at
  an invalid query.

- templates/recipes/fan-in.md + fan-out.md: 'most-imported' /
  'they import from many other files' → 'most depended-on' /
  'they depend on many other files'. The dependencies table
  aggregates static imports + dynamic imports + resolved module-
  graph edges, so the import-only framing was narrower than what
  the metric measures.
SutuSebastian added a commit that referenced this pull request May 2, 2026
… status snapshot (#38)

The Adjacent — also shipped post-refresh block already listed PR #35 (MCP server). Adding PR #37: bundled recipes migrated to templates/recipes/<id>.{sql,md}, project-local recipes via .codemap/recipes/, catalog gains source/body/shadows fields, YAML frontmatter actions, load-time DML/DDL deny-list. Pure docs refresh — no behavior change.
SutuSebastian added a commit that referenced this pull request May 2, 2026
Mirrors the every-verb-becomes-a-tool pattern from PR #35. Discoverability win matters for agents that don't know the symbols schema; token savings compound. ~25 LOC registration; reuses the engine helper.
SutuSebastian added a commit that referenced this pull request May 2, 2026
Agent-first: gives data + structured warning; preserves agent autonomy (e.g. 'I want stale to compare with what changed'). Refuse + auto-reindex both rejected — refuse forces 3 round-trips for content already on disk; auto-reindex hides side-effects from a read tool and breaks the read/write separation we kept clean across PRs #33 / #35 / #37.

All 6 grill questions now settled — ready for tracer 1.
SutuSebastian added a commit that referenced this pull request May 2, 2026
Pure transport-agnostic lookup engine — same shape audit-engine.ts / query-engine.ts use (PRs #33 / #35). findSymbolsByName({db, name, kind?, inPath?}) returns SymbolMatch[] with deterministic order (file_path ASC, line_start ASC) so callers slice for stable disambiguation output.

Per Q-3 settled: name match is case-sensitive (exact). Per Q-4 settled: inPath uses a directory-vs-file heuristic — trailing slash OR no extension in trailing segment treats as prefix (LIKE 'src/cli/%'); else exact file match (file_path = ?). Caller normalizes via toProjectRelative before passing.
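
A small sketch of the directory-vs-file heuristic described above, assuming the path has already been normalized to a project-relative POSIX form. `buildInPathFilter` and `PathFilter` are hypothetical names; only the rule itself (trailing slash or no extension means prefix match, otherwise exact match) comes from the commit message.

```typescript
interface PathFilter {
  clause: string; // SQL fragment with one placeholder
  param: string;  // value bound to the placeholder
}

function buildInPathFilter(inPath: string): PathFilter {
  const trailing = inPath.split("/").pop() ?? "";
  // Trailing slash, or no "." in the last segment, is treated as a directory.
  const looksLikeDirectory = inPath.endsWith("/") || !trailing.includes(".");
  if (looksLikeDirectory) {
    const prefix = inPath.endsWith("/") ? inPath : `${inPath}/`;
    return { clause: "file_path LIKE ?", param: `${prefix}%` };
  }
  return { clause: "file_path = ?", param: inPath };
}
```

Two limits of this sketch: a directory whose last segment contains a dot is treated as a file, and LIKE wildcards in the path (`_`, `%`) are not escaped.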

12 unit tests cover: single match, unknown name, ambiguous (3-match deterministic order), kind filter narrowing, inPath as directory (no slash + with slash), inPath as file (exact + miss), kind+inPath compose AND, returned columns, case-sensitivity.

Reuses the symbols table directly. No schema change. Tracer 2 wires the CLI verb on top.
SutuSebastian added a commit that referenced this pull request May 2, 2026
Wires the show + snippet CLI verbs as MCP tools per Q-1 settled. Both follow the established cmd-* ↔ register*Tool pattern from PR #35; both reuse the same engine helpers (findSymbolsByName, buildShowResult, buildSnippetResult) so output shape is verbatim from each tool's CLI counterpart's --json envelope.

- registerShowTool — args {name, kind?, in?}, returns the {matches, disambiguation?} envelope. Tool description teaches: 'Use snippet for source text; use query with LIKE for fuzzy lookup' so agents know when to reach for which tool.

- registerSnippetTool — args {name, kind?, in?}, returns the same envelope with source/stale/missing on each match. Description spells out the stale semantics (read + flag, agent decides) since that's the one non-obvious bit.

Both tools route the in arg through toProjectRelative(opts.root, args.in) so MCP callers get the same path-shape leniency as the CLI (--in ./src/cli/, --in src/cli, --in src/cli/cmd-show.ts all work identically).

8 new in-process MCP tests via @modelcontextprotocol/sdk's InMemoryTransport: tools/list lists both, single-match envelope, multi-match disambiguation, in-filter narrows, unknown-name returns empty, snippet source on fresh file (stale: false), stale flag on hash drift, missing flag on rm'd file.

Total now 38 MCP tests pass.
SutuSebastian added a commit that referenced this pull request May 2, 2026
* docs(plans): draft targeted-read-cli (codemap show)

One-step CLI verb for 'where is this symbol' — codemap show <symbol> returns file_path:line_start-line_end + signature. Pure ergonomic affordance over SELECT … FROM symbols WHERE name = ?; no schema change.

Plan covers surface (show + --all + --kind + --in flags), wiring (cmd-show.ts + show-engine.ts mirroring cmd-context/cmd-validate), MCP integration via the plan §35 pattern, and a 4-commit tracer-bullet sequence (~half day).

5 open questions worth a grill round before code: MCP tool registration, multiple-match UX (error vs list), exact vs fuzzy matching, file-scope filter, snippet-sibling timing. Status: design pass; not yet implemented.

* docs(plans): settle Q-1 — show ships as a dedicated MCP tool

Mirrors the every-verb-becomes-a-tool pattern from PR #35. Discoverability win matters for agents that don't know the symbols schema; token savings compound. ~25 LOC registration; reuses the engine helper.

* docs(plans): settle Q-2 — always-wrap {matches, disambiguation?} envelope

Agent-first reframing: 'error by default' was 2023-era reasoning; today's frontier models reason fine over 2-5 candidates given context. Always-wrap gives a single shape to learn / document / test, plus forward extensibility for future disambiguation aids (nearest_to_cursor, most_recently_modified, caller_count) without breaking the contract.

Single match: {matches: [{...}]}. Multi-match: {matches: [...], disambiguation: {n, by_kind, files, hint}}. Agent reads result.matches[0] either way.
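
To make the always-wrap shape concrete, here is a hedged sketch of how such an envelope could be assembled. `buildShowResult` is named elsewhere in this thread, but the signature, the `SymbolMatch` fields, and the hint text below are assumptions for illustration.

```typescript
// Assumed subset of the match columns.
interface SymbolMatch {
  name: string;
  kind: string;
  file_path: string;
  line_start: number;
  line_end: number;
}

interface Disambiguation {
  n: number;
  by_kind: Record<string, number>;
  files: string[];
  hint: string;
}

interface ShowResult {
  matches: SymbolMatch[];
  disambiguation?: Disambiguation;
}

function buildShowResult(matches: SymbolMatch[]): ShowResult {
  // Zero or one match: the envelope carries only the matches array.
  if (matches.length <= 1) return { matches };
  const by_kind: Record<string, number> = {};
  const files: string[] = [];
  for (const m of matches) {
    by_kind[m.kind] = (by_kind[m.kind] ?? 0) + 1;
    if (files.indexOf(m.file_path) === -1) files.push(m.file_path);
  }
  return {
    matches,
    disambiguation: {
      n: matches.length,
      by_kind,
      files,
      hint: "Multiple symbols share this name; narrow with kind or in.", // placeholder text
    },
  };
}
```

A caller reads `result.matches[0]` in both cases and looks at `disambiguation` only when it is present.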

* docs(plans): settle Q-3 — exact match only; fuzzy stays in query

show contract is sharp: 'I know the name → I want to know where it lives.' Agents have the exact name 95% of the time (stack traces, import statements, prior query results). Error message points at query+LIKE for fuzzy so the agent's next move is explicit. Avoids burning a flag on a feature query already does.

* docs(plans): settle Q-4 — ship --in <path> file-scope filter

Closes the loop with the Q-2 disambiguation envelope: the agent sees candidate files in disambiguation.files and narrows with --in by adding a parameter, rather than switching tools to query. --kind handles 'function vs const' ambiguity; --in handles 'this folder vs that folder' (the common case). ~5 LOC. Match rule: prefix match if the path ends with / or names a directory, else exact file match.

* docs(plans): expand to show + snippet, settle Q-5, open Q-6, fold fact-check refinements

After fact-checking against the refreshed codemap index, snippet's marginal cost is smaller than initially framed:

- findSymbolsByName (Q-1 helper) is shared with show — free reuse
- readFileSync + toProjectRelative + hashContent + files.content_hash IS the literal pattern cmd-validate.ts already uses for stale detection — pure copy-paste
- ~2-3 hours marginal cost on top of show; splitting into a follow-up PR would duplicate docs / changeset / Rule-10 mirror overhead

Q-5 settled: ship snippet alongside show in v1. Output is {matches: [{...metadata, source, stale?}]} — additive on Q-2's envelope, no shape divergence.

Q-2 updated: explicit requirement that BOTH the CLI's --json mode AND the MCP tool wrap results in {matches, disambiguation?}. This is required to preserve plan §4 uniformity: if the CLI printed a bare array while MCP returned the envelope, uniformity would be broken.

Q-4 updated: --in <path> normalization via existing toProjectRelative(projectRoot, p) helper (verified — already handles leading ./, trailing /, Windows backslash → POSIX). No reinventing.

Q-6 opened: stale-file behavior for snippet — read+flag (1) vs refuse (2) vs auto-reindex (3). Bias toward (1) per agent-first lens (no hostile round-trip, no hidden side-effects).

Tracer-bullet sequence expanded from 4 → 6 commits (~1 day total). Non-goals updated: snippet no longer deferred; --with-source flag explicitly rejected per Q-5; auto-reindex on stale explicitly rejected pending Q-6 confirmation; glob characters in --in explicitly out of scope.

* docs(plans): settle Q-6 — read + flag stale snippets

Agent-first: gives data + structured warning; preserves agent autonomy (e.g. 'I want stale to compare with what changed'). Refuse + auto-reindex both rejected — refuse forces 3 round-trips for content already on disk; auto-reindex hides side-effects from a read tool and breaks the read/write separation we kept clean across PRs #33 / #35 / #37.

All 6 grill questions now settled — ready for tracer 1.

* feat(show): show-engine.ts findSymbolsByName + tests (Tracer 1 of 6)

Pure transport-agnostic lookup engine — same shape audit-engine.ts / query-engine.ts use (PRs #33 / #35). findSymbolsByName({db, name, kind?, inPath?}) returns SymbolMatch[] with deterministic order (file_path ASC, line_start ASC) so callers slice for stable disambiguation output.

Per Q-3 settled: name match is case-sensitive (exact). Per Q-4 settled: inPath uses a directory-vs-file heuristic — trailing slash OR no extension in trailing segment treats as prefix (LIKE 'src/cli/%'); else exact file match (file_path = ?). Caller normalizes via toProjectRelative before passing.
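
A minimal sketch of the directory-vs-file heuristic described above, assuming the path has already been normalized to POSIX separators. `classifyInPath` and `PathFilter` are hypothetical names, and the sketch does not escape LIKE wildcards (a later commit in this PR handles that).

```typescript
// Hypothetical result type: either a directory prefix or an exact file path.
type PathFilter =
  | { mode: "prefix"; prefix: string } // would feed a LIKE 'prefix%' clause
  | { mode: "exact"; file: string };   // would feed file_path = ?

function classifyInPath(inPath: string): PathFilter {
  // Drop trailing slashes so "src/cli/" and "src/cli" produce the same prefix.
  const trimmed = inPath.replace(/\/+$/, "");
  const lastSegment = trimmed.slice(trimmed.lastIndexOf("/") + 1);

  // Trailing slash OR no extension in the last segment => treat as a directory.
  const looksLikeDir = inPath.endsWith("/") || !lastSegment.includes(".");

  return looksLikeDir
    ? { mode: "prefix", prefix: `${trimmed}/` }
    : { mode: "exact", file: trimmed };
}
```

The known cost of this heuristic is that an extensionless file (for example a `Makefile`) is treated as a directory; passing `kind` or falling back to `query` covers that case.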

12 unit tests cover: single match, unknown name, ambiguous (3-match deterministic order), kind filter narrowing, inPath as directory (no slash + with slash), inPath as file (exact + miss), kind+inPath compose AND, returned columns, case-sensitivity.

Reuses the symbols table directly. No schema change. Tracer 2 wires the CLI verb on top.

* feat(show): codemap show <name> CLI verb (Tracer 2 of 6)

Implements the show CLI verb per the settled grill round:

- parseShowRest — argv parser supporting <name> + --kind + --in + --json (+ --help / -h). Errors on missing name, extra positional, unknown flags, and missing flag values.
- buildShowResult — wraps engine output in the {matches, disambiguation?} envelope (Q-2 settled). Single-match → {matches}; multi-match adds n / by_kind / files / hint structured aids.
- runShowCmd — bootstraps codemap, normalizes --in via toProjectRelative (Q-4), runs findSymbolsByName, renders. JSON mode prints the envelope verbatim; terminal mode prints path:line-line + signature per row + a stderr disambiguation hint on multi-match.
- Error UX (Q-3): unknown name → routed-error message pointing at `codemap query --json "SELECT … LIKE '%name%'"` so the agent's next step is explicit.
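
The parser contract in the first bullet could look roughly like the sketch below. It is an illustration only: `parseShowArgs`, the `ShowArgs` type, and the error strings are assumptions, and it takes the arguments after the verb rather than the full argv.

```typescript
// Hypothetical parsed-args shape for the sketch.
type ShowArgs =
  | { mode: "help" }
  | {
      mode: "run";
      name: string;
      symbolKind: string | undefined;
      inPath: string | undefined;
      json: boolean;
    };

function parseShowArgs(rest: string[]): ShowArgs {
  const queue = rest.slice();
  let name: string | undefined;
  let symbolKind: string | undefined;
  let inPath: string | undefined;
  let json = false;

  let arg: string | undefined;
  while ((arg = queue.shift()) !== undefined) {
    if (arg === "--help" || arg === "-h") return { mode: "help" };
    if (arg === "--json") {
      json = true;
      continue;
    }
    if (arg === "--kind" || arg === "--in") {
      // Value flags consume the next token; a missing or flag-like token is an error.
      const value = queue.shift();
      if (value === undefined || value.startsWith("-")) {
        throw new Error(`${arg} requires a value`);
      }
      if (arg === "--kind") symbolKind = value;
      else inPath = value;
      continue;
    }
    if (arg.startsWith("-")) throw new Error(`unknown flag: ${arg}`);
    if (name !== undefined) throw new Error(`unexpected extra argument: ${arg}`);
    name = arg;
  }

  if (name === undefined) throw new Error("missing <name>");
  return { mode: "run", name, symbolKind, inPath, json };
}
```

Because flags are consumed from a queue, `foo --in src/cli` and `--in src/cli foo` parse identically, which is the order-independence the tests mention.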

Wired into main.ts dispatch + bootstrap.ts validateIndexModeArgs known-verbs list + help text. toProjectRelative exported from cmd-validate.ts (was private).

13 unit tests cover parser (help/missing/extra/unknown-flag/--kind/--in/order-independence/throws-if-not-show) + buildShowResult envelope (single / zero / multi / file dedup).

Smoke tested: show runQueryCmd / --json / --in / unknown-name all behave per spec.

* feat(show): readSymbolSource + getIndexedContentHash with stale detection (Tracer 3 of 6)

Adds the snippet-side engine helpers per Q-5 (ship snippet alongside show) + Q-6 (read + flag stale, never refuse + never auto-reindex):

- readSymbolSource({match, projectRoot, indexedContentHash?}) returns {source, stale, missing}. Reuses readFileSync + hashContent + the same FS pattern cmd-validate.ts uses (verified during fact-check). Line slicing is 1-indexed inclusive matching symbols.line_start/line_end. Clamps line_end past EOF instead of throwing.

- getIndexedContentHash(db, filePath) — convenience helper for the same SELECT cmd-validate.ts uses.

Stale semantics (Q-6): source is ALWAYS returned when the file exists; stale: true is just a metadata flag the agent reads. Missing file → {source: undefined, stale: true, missing: true}. indexedContentHash undefined → never marks stale (caller opts out of staleness checks).
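
The read-and-flag semantics can be sketched as a pure function over already-loaded file content. This is an assumption-laden illustration, not the real readSymbolSource: `sliceSymbolSource` and `FileOnDisk` are hypothetical, and the caller is assumed to have read the file and computed its content hash beforehand.

```typescript
// Hypothetical input: file text plus a hash the caller computed from it.
interface FileOnDisk {
  content: string;
  contentHash: string;
}

interface SourceRead {
  source: string | undefined;
  stale: boolean;
  missing: boolean;
}

function sliceSymbolSource(
  file: FileOnDisk | undefined, // undefined models a file that no longer exists
  lineStart: number,            // 1-indexed, inclusive
  lineEnd: number,              // 1-indexed, inclusive
  indexedContentHash?: string,  // undefined = caller opts out of staleness checks
): SourceRead {
  if (file === undefined) {
    return { source: undefined, stale: true, missing: true };
  }

  const lines = file.content.split("\n");
  // Clamp instead of throwing when the index says the symbol runs past EOF.
  const end = Math.min(lineEnd, lines.length);
  const source = lines.slice(lineStart - 1, end).join("\n");

  // Stale is metadata only: the source is returned either way.
  const stale =
    indexedContentHash !== undefined && file.contentHash !== indexedContentHash;

  return { source, stale, missing: false };
}
```

Keeping the slice pure leaves the filesystem read and hashing at the edge, so the stale and missing branches can be tested without touching disk.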

7 new unit tests cover line slicing happy path, missing file, hash-match (stale: false), hash-mismatch (stale: true + source still returned), EOF clamping, opt-out via undefined hash, and getIndexedContentHash lookup. Total now 19 pass on show-engine.

Tracer 4 next: cmd-snippet.ts CLI verb on top of these helpers.

* feat(snippet): codemap snippet <name> CLI verb (Tracer 4 of 6)

Sibling to show: same lookup contract (name + kind + in + json) but returns source text from disk per match. Output envelope: {matches: [{...metadata, source, stale, missing}], disambiguation?: {...}} — additive on Q-2's envelope (one source/stale/missing field per row, never a shape divergence).

- parseSnippetRest mirrors parseShowRest's parser (same flags, same errors).
- buildSnippetResult enriches each SymbolMatch with source/stale/missing via getIndexedContentHash + readSymbolSource (Tracer 3 helpers). Per Q-6: source ALWAYS returned when file exists; stale/missing are pure metadata flags the agent reads.
- runSnippetCmd mirrors runShowCmd's bootstrap + lookup + render. Terminal mode prints path:line-line[STALE/MISSING flags] + source; --json mode emits the envelope verbatim. Stderr hint when any row is stale points at codemap / codemap --files <path> for refresh.

Wired into main.ts dispatch + bootstrap.ts known-verbs + help text.

11 unit tests cover parser (help/missing/extra/unknown/--kind/--in/order/throws-not-snippet) + buildSnippetResult (single match w/ source, stale flag on hash drift, missing flag on rm'd file, multi-match disambiguation envelope).

Smoke tested: bun src/index.ts snippet runQueryCmd --json returns the function source + metadata + stale: false.

* feat(mcp): show + snippet MCP tools (Tracer 5 of 6)

Wires the show + snippet CLI verbs as MCP tools per Q-1 settled. Both follow the established cmd-* ↔ register*Tool pattern from PR #35; both reuse the same engine helpers (findSymbolsByName, buildShowResult, buildSnippetResult) so output shape is verbatim from each tool's CLI counterpart's --json envelope.

- registerShowTool — args {name, kind?, in?}, returns the {matches, disambiguation?} envelope. Tool description teaches: 'Use snippet for source text; use query with LIKE for fuzzy lookup' so agents know when to reach for which tool.

- registerSnippetTool — args {name, kind?, in?}, returns the same envelope with source/stale/missing on each match. Description spells out the stale semantics (read + flag, agent decides) since that's the one non-obvious bit.

Both tools route the in arg through toProjectRelative(opts.root, args.in) so MCP callers get the same path-shape leniency as the CLI (--in ./src/cli/, --in src/cli, --in src/cli/cmd-show.ts all work identically).

8 new in-process MCP tests via @modelcontextprotocol/sdk's InMemoryTransport: tools/list lists both, single-match envelope, multi-match disambiguation, in-filter narrows, unknown-name returns empty, snippet source on fresh file (stale: false), stale flag on hash drift, missing flag on rm'd file.

Total now 38 MCP tests pass.

* docs(show + snippet): architecture / glossary / README / agent rule + skill (Tracer 6 of 6)

Lifts canonical bits out of docs/plans/targeted-read-cli.md per docs/README.md Rule 2 (delete plans on ship). Surfaces touched:

- architecture.md § CLI usage gains a 'Show / snippet wiring' paragraph documenting the cmd-show ↔ cmd-snippet ↔ show-engine seam, the {matches, disambiguation?} envelope, the toProjectRelative + hashContent primitive reuse from cmd-validate.ts, and the stale-file behavior (read + flag, no auto-reindex).

- glossary.md § S: new entries 'show' and 'snippet' with disambiguation envelope reference + cross-link to architecture.md.

- roadmap.md: removed the targeted-read-cli backlog entry (now shipped).

- README.md CLI block: added show + snippet examples covering the metadata vs source-text distinction and the disambiguation envelope shape.

- .agents/rules/codemap.md + templates/agents/rules/codemap.md (mirrored per Rule 10): added two CLI table rows (Targeted read metadata, Targeted read source text) + a 'Targeted reads' section documenting the envelope, --kind / --in flags, exact-match semantics, and snippet stale-file behavior.

- .agents/skills/codemap/SKILL.md + templates/agents/skills/codemap/SKILL.md (mirrored): MCP tools list extended with show + snippet entries describing args, envelope shape, and stale semantics. Tools list in agent rule extended too.

- docs/plans/targeted-read-cli.md DELETED (Rule 2 — plan content fully lifted into architecture / glossary / agent files).

- Minor changeset added (additive features, no schema breaks).

* chore(security): defence-in-depth fixes from PR self-audit

Three small hygiene fixes from the security audit on PR #39:

1. agents-init.ts relPathToAbsSegments — now rejects '..' and '.' segments instead of just filtering empty strings. Defence in depth: today's callers source rel from listRegularFilesRecursive (package-controlled, never produces '..'), but a future caller passing user-provided relative paths would otherwise allow join(destRoot, '..', 'etc', 'passwd') to write outside destRoot. Throws loud instead of silently writing somewhere unexpected. 5 new unit tests cover happy path, empty-segment filter, '..' at start, '..' in middle, and '.' rejection.

2. cmd-show.ts + cmd-snippet.ts unknown-name error — escapes single-quotes (SQLite '' convention) before embedding the user-provided name into the suggested SQL hint. No execution risk (the message is just text), but the previous version emitted SQL like LIKE '%'; DROP TABLE symbols; --%' which looks injection-y in agent traces and breaks if the agent copy-pastes the hint. Now safe for names like O'Brien.

3. .github/workflows/ci.yml — added an audit job running 'bun audit' on every PR. Marked continue-on-error: true (non-blocking) so transient registry issues or low-severity transitive CVEs don't gate merges. Promote to a hard gate once the team agrees on a vulnerability budget. Verified bun audit works locally + reports zero vulnerabilities today.

All three are tiny, additive, and follow defence-in-depth rather than fixing live exploits — the original audit found no exploitable vulnerabilities in the codebase.
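
Fixes 1 and 2 can be illustrated with the sketch below. The function names `toSafeSegments`, `escapeSqlStringLiteral`, and `likeHintFragment` are hypothetical stand-ins, and the hint fragment is only an assumed shape, since the full suggested query is not spelled out here.

```typescript
// Fix 1 (sketch): split a relative path and refuse traversal-style segments.
function toSafeSegments(rel: string): string[] {
  const segments = rel.split("/").filter((s) => s.length > 0);
  for (const s of segments) {
    if (s === "." || s === "..") {
      // Fail loudly rather than let join(destRoot, ...segments) escape destRoot.
      throw new Error(`Refusing path segment "${s}" in "${rel}"`);
    }
  }
  return segments;
}

// Fix 2 (sketch): SQLite string literals escape a single quote by doubling it.
function escapeSqlStringLiteral(value: string): string {
  return value.replace(/'/g, "''");
}

// Assumed shape of the fragment embedded in the human-readable hint.
function likeHintFragment(name: string): string {
  return `LIKE '%${escapeSqlStringLiteral(name)}%'`;
}
```

With this escaping, a name such as `O'Brien` yields `LIKE '%O''Brien%'`, which stays a single well-formed string literal if an agent copies the hint into `query`.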

* fix(show): escape SQL LIKE wildcards in --in path (PR #39 CodeRabbit feedback, Major)

Real bug verified against actual SQLite semantics: when --in src/__tests__ became LIKE 'src/__tests__/%', the underscores matched ANY single char so the query also matched src/aatestsZZ/foo.ts. Underscores are ubiquitous in TS layouts (__tests__, __mocks__, _utils, _helpers).

Fix: new escapeLikeLiteral helper escapes _, %, and \ (the escape char itself); the LIKE clause now uses ESCAPE '\'. Trailing % we append stays an unescaped wildcard. Symmetric handling so paths with literal '%' (rare but possible in OS file names) also match exactly.

Tests: 1 integration test seeds both src/__tests__/setup.ts and a same-shape decoy src/aatestsZZ/decoy.ts; --in src/__tests__ now returns only the real one. 4 unit tests cover the escape helper (underscore, percent, backslash, identity).
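
A minimal sketch of the escaping described in this fix. The body of `escapeLikeLiteral` here is an assumption about how such a helper might be written, and `dirPrefixPattern` is a hypothetical companion; the resulting pattern is meant to be bound to a clause of the form `file_path LIKE ? ESCAPE '\'`.

```typescript
// Escape the LIKE metacharacters (_ and %) plus the escape character itself.
function escapeLikeLiteral(literal: string): string {
  return literal.replace(/[\\%_]/g, (ch) => `\\${ch}`);
}

// Build a directory-prefix pattern: escaped literal path, then one real wildcard.
function dirPrefixPattern(dir: string): string {
  const trimmed = dir.replace(/\/+$/, "");
  return `${escapeLikeLiteral(trimmed)}/%`;
}

// Each underscore in the path is now preceded by a backslash, so it can only
// match a literal "_"; the trailing % is the sole remaining wildcard.
console.log(dirPrefixPattern("src/__tests__"));
```

Escaping the backslash first-class matters: without it, a path containing a literal `\` would turn the following character into an escaped one and change what the pattern matches.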
SutuSebastian added a commit that referenced this pull request May 2, 2026
…er-reversal imports) (#41)

* refactor(audit): lift resolveAuditBaselines cmd-audit → audit-engine (Tracer 1 of 5)

Pure move — function logic unchanged. Closes one of the 5 layer-reversal imports application/mcp-server.ts had on cli/* (called out in PR #35 self-audit). Now MCP server imports resolveAuditBaselines from the engine alongside runAudit, like the proper cmd-* ↔ *-engine seam.

cmd-audit.ts test imports updated; CLI handler still calls the function (now via engine import). No behavior change.

* refactor(recipes): lift cli/query-recipes → application/query-recipes (Tracer 2a of 5)

Pre-req for lifting buildContextEnvelope in Tracer 2b — query-recipes is engine-shaped (no CLI args, just exports for both CLI and MCP), so its location in cli/ was a misfile.

git mv preserves history; only changes are the import paths in 6 callers and the relative paths inside the moved files.

* refactor(context): lift buildContextEnvelope cmd-context → context-engine (Tracer 2b of 5)

buildContextEnvelope, classifyIntent, ContextEnvelope, readScalarInt are pure (DB read + envelope build, no I/O / argv / printing) — engine-shaped. Same pattern as audit-engine in Tracer 1.

cmd-context.ts now holds parse/help/run only. mcp-server imports from the new engine instead of the CLI shell.

* refactor(validate): lift computeValidateRows + toProjectRelative cmd-validate → validate-engine (Tracer 3 of 5)

Both functions are pure (no argv, no printing — just DB rows + filesystem hash compare). toProjectRelative was already public-API for cmd-show / cmd-snippet (cross-CLI reuse) and the MCP show/snippet handlers; this lift removes the cli→cli import edge and the mcp→cli edge in one move.

cmd-validate.ts now holds parse / help / run only.

* refactor(show): lift buildShowResult + buildSnippetResult cmd-show/snippet → show-engine (Tracer 4 of 5)

Closes the last 2 layer-reversal imports application/mcp-server.ts had on cli/* (originally called out in PR #35 self-audit). Verified: `grep 'from "../cli/' src/application` is empty.

Both envelope builders are pure (transform a SymbolMatch[] into a {matches, disambiguation?} shape; snippet additionally enriches via readSymbolSource). They sit naturally next to the engine functions that produce / read their inputs (findSymbolsByName, readSymbolSource, getIndexedContentHash).

cmd-show.ts and cmd-snippet.ts now hold parse/help/run/render only — same cmd-* ↔ *-engine seam Tracers 1-3 established.
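
The seam these tracers establish can be pictured with the toy sketch below, collapsed into one file for brevity. Every name in it is a hypothetical stand-in (the real modules live in separate files under src/cli and src/application); the only point is that both transports call the same pure builder and therefore emit the same envelope.

```typescript
// --- engine layer (stand-in for a *-engine.ts module) ---
interface Row {
  file_path: string;
  line_start: number;
}
interface Envelope {
  matches: Row[];
}

// Pure: no argv parsing, no printing, no transport knowledge.
function buildEnvelope(rows: Row[]): Envelope {
  return { matches: rows };
}

// --- CLI shell (stand-in for cmd-*.ts): render only ---
function renderCliJson(rows: Row[]): string {
  return JSON.stringify(buildEnvelope(rows));
}

// --- MCP handler (stand-in for a tool handler): wraps the same envelope
// in a text content item, the usual shape of an MCP tool result ---
function handleToolCall(rows: Row[]): { content: { type: "text"; text: string }[] } {
  return { content: [{ type: "text", text: JSON.stringify(buildEnvelope(rows)) }] };
}
```

Since neither shell builds the envelope itself, the MCP layer never needs to import from the CLI layer, which is the dependency direction the grep check above verifies.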

* docs: sync architecture.md / research / benchmark to lifted application/* layering (Tracer 5 of 5)

- architecture.md table (line ~23) + 'Repository layout' (line ~100): expand application/ description to enumerate all engines (run-index, index-engine, query-engine, audit-engine, context-engine, validate-engine, show-engine, query-recipes, recipes-loader, mcp-server) + state the never-import-cli/ rule.
- 'Validate / Audit / Context / Show-snippet / Recipes wiring' paragraphs: name the new engine files, point toProjectRelative at validate-engine, point QUERY_RECIPES at application/.
- docs/research/competitive-scan-2026-04.md + fallow.md: dangling src/cli/query-recipes.ts links updated to src/application/query-recipes.ts; context/validate rows now name both shell + engine.
- docs/benchmark.md: getQueryRecipeSql import path updated.
- changeset: patch — internal refactor, no behavior / public API change.

* chore(mcp-server): drop outdated // Layer note (PR #41 obsoleted it)

The note described the old state — engine helpers used to live under src/cli/ and we imported them here as 'pure data'. Tracers 1-4 of this PR lifted every one of them into src/application/, so the note's prediction ("future refactor may lift them...") is now history. With every import below from ./*-engine, the comment carried no info a teammate couldn't re-derive in <30s. User-confirmed delete per preserve-comments Rule 4.