diff --git a/.agents/rules/codemap.md b/.agents/rules/codemap.md index 68d8b23..44a1a77 100644 --- a/.agents/rules/codemap.md +++ b/.agents/rules/codemap.md @@ -26,6 +26,8 @@ A local database (default **`.codemap.db`**) indexes structure: symbols, imports | List / drop baselines | — | `bun src/index.ts query --baselines` · `bun src/index.ts query --drop-baseline ` | | Per-delta audit | — | `bun src/index.ts audit --json --baseline base` (auto-resolves `base-files` / `base-dependencies` / `base-deprecated`) | | MCP server (for agent hosts) | — | `bun src/index.ts mcp` — JSON-RPC on stdio; one tool per CLI verb. See **MCP** section below. | +| Targeted read (metadata) | — | `bun src/index.ts show [--kind ] [--in ] [--json]` — file:line + signature | +| Targeted read (source text) | — | `bun src/index.ts snippet [--kind ] [--in ] [--json]` — same lookup + source from disk + stale flag | **Recipe `actions`:** with **`--json`**, recipes that define an `actions` template append it to every row (kebab-case verb + description — e.g. `fan-out` → `review-coupling`). Under `--baseline`, actions attach to the **`added`** rows only. Inspect via **`--recipes-json`**. Ad-hoc SQL never carries actions. @@ -48,9 +50,11 @@ Validation: SQL is rejected at load time if it starts with DML/DDL (DELETE/DROP/ **Audit (`bun src/index.ts audit`)**: structural-drift command; emits `{head, deltas: {files, dependencies, deprecated}}` (each delta carries its own `base` metadata). Reuses B.6 baselines as the snapshot source. Two CLI shapes — `--baseline ` auto-resolves `-files` / `-dependencies` / `-deprecated`; `---baseline ` is the explicit per-delta override. v1 ships no `verdict` / threshold config — consumers compose `--json` + `jq` for CI exit codes. Auto-runs an incremental index before the diff (use `--no-index` to skip for frozen-DB CI). +**Targeted reads (`show` / `snippet`)**: precise lookup by exact symbol name without composing SQL. 
`show` returns metadata (`file_path:line_start-line_end` + `signature`); `snippet` returns the source text from disk plus `stale` / `missing` flags. Both share the same flag set (`--kind ` to filter by `symbols.kind`, `--in ` for file-scope filter — directory prefix or exact file). Output envelope is `{matches, disambiguation?}` — single match → `{matches: [{...}]}`; multi-match adds `disambiguation: {n, by_kind, files, hint}` so agents narrow without re-scanning. Name match is exact / case-sensitive — for fuzzy use `query` with `LIKE '%name%'`. Snippet stale-file behavior: `source` is always returned when the file exists; `stale: true` means the line range may have shifted (re-index with `bun src/index.ts` or `--files ` before acting on the source). + **MCP server (`bun src/index.ts mcp`)**: stdio MCP (Model Context Protocol) server — agents call codemap as JSON-RPC tools instead of shelling out to the CLI on every read. v1 ships one tool per CLI verb plus four lazy-cached resources: -- **Tools:** `query` / `query_batch` / `query_recipe` / `audit` / `save_baseline` / `list_baselines` / `drop_baseline` / `context` / `validate`. Snake_case keys (Codemap convention matching MCP spec examples + reference servers — spec is convention-agnostic; CLI stays kebab). +- **Tools:** `query` / `query_batch` / `query_recipe` / `audit` / `save_baseline` / `list_baselines` / `drop_baseline` / `context` / `validate` / `show` / `snippet`. Snake_case keys (Codemap convention matching MCP spec examples + reference servers — spec is convention-agnostic; CLI stays kebab). - **`query_batch` (MCP-only):** N statements in one round-trip. Items are `string | {sql, summary?, changed_since?, group_by?}` — string form inherits batch-wide flag defaults, object form overrides on a per-key basis. Per-statement errors are isolated. - **`save_baseline` (polymorphic):** one tool, `{name, sql? | recipe?}` with runtime exclusivity check (mirrors the CLI's single `--save-baseline=` verb). 
- **Resources:** `codemap://recipes` (catalog), `codemap://recipes/{id}` (one recipe), `codemap://schema` (live DDL from `sqlite_schema`), `codemap://skill` (bundled SKILL.md text). Lazy-cached on first `read_resource`. diff --git a/.agents/skills/codemap/SKILL.md b/.agents/skills/codemap/SKILL.md index b0ff4cb..b45000d 100644 --- a/.agents/skills/codemap/SKILL.md +++ b/.agents/skills/codemap/SKILL.md @@ -67,6 +67,8 @@ Each emitted delta carries its own `base` metadata so mixed-baseline audits are - **`drop_baseline`** — `{name}`. Returns `{dropped: }` on success or `isError` if the name doesn't exist. - **`context`** — `{compact?, intent?}`. Returns the project-bootstrap envelope (codemap version, schema version, file count, language breakdown, hubs, sample markers). Designed for agent session-start — one call replaces 4-5 `query` calls. - **`validate`** — `{paths?: string[]}`. Compares on-disk SHA-256 to indexed `files.content_hash`; empty `paths` validates everything. Returns rows with status (`ok`/`stale`/`missing`/`unindexed`). +- **`show`** — `{name, kind?, in?}`. Exact, case-sensitive symbol name lookup. Returns `{matches: [{name, kind, file_path, line_start, line_end, signature, ...}], disambiguation?: {n, by_kind, files, hint}}`. Single match → `{matches: [{...}]}`; multi-match adds the disambiguation envelope so you narrow without re-scanning. Fuzzy lookup belongs in `query` with `LIKE`. +- **`snippet`** — `{name, kind?, in?}`. Same lookup as `show` but each match also carries `source` (file lines from disk at `line_start..line_end`), `stale` (true when content_hash drifted since indexing — line range may have shifted), `missing` (true when file is gone). Per Q-6 (settled): `source` is always returned when the file exists; agent decides whether to act on stale content or run `codemap` / `codemap --files ` to re-index first. No auto-reindex side-effects from this read tool. 
**Resources (lazy-cached on first `read_resource`; constant for server-process lifetime):** diff --git a/.changeset/targeted-read-cli.md b/.changeset/targeted-read-cli.md new file mode 100644 index 0000000..3a990e0 --- /dev/null +++ b/.changeset/targeted-read-cli.md @@ -0,0 +1,49 @@ +--- +"@stainless-code/codemap": minor +--- + +feat(show + snippet): targeted-read CLI verbs + MCP tools + +Two sibling verbs that close the "agent wants to read this thing" loop +without composing SQL: + +- **`codemap show `** — returns metadata + (`file_path:line_start-line_end` + `signature` + `kind`) for the + symbol(s) matching the exact name (case-sensitive). +- **`codemap snippet `** — same lookup; each match also carries + `source` (file lines from disk), `stale` (true when content_hash + drifted since indexing), `missing` (true when file is gone). + +Both share the same flag set (`--kind ` filter, `--in ` file +scope — directory prefix or exact file, normalized via the existing +`toProjectRelative` helper for cross-platform consistency). + +Output is the agent-friendly `{matches, disambiguation?}` envelope on +both CLI `--json` and MCP responses (uniformity contract per the MCP +plan). Single match → `{matches: [{...}]}`; multi-match adds +`disambiguation: {n, by_kind, files, hint}` — structured aids so the +agent narrows without scanning every row. Forward-extensible (future +`nearest_to_cursor` / `most_recently_modified` / `caller_count` fields +land as additive keys). + +MCP tools `show` and `snippet` register parallel to the CLI verbs and +auto-inherit the same envelope shape. + +Stale-file behavior on snippet: `source` is always returned when the +file exists; `stale: true` is metadata the agent reads. No refusal, +no auto-reindex side-effects — read tool stays read-only. 
+ +Architecturally: pure transport-agnostic engine in +`src/application/show-engine.ts` (mirrors the `cmd-*` ↔ `*-engine` seam +from PRs #33 / #35 / #37); thin CLI verbs in `src/cli/cmd-show.ts` and +`src/cli/cmd-snippet.ts`. Reuses `findSymbolsByName`, `hashContent` +(from `src/hash.ts`), `toProjectRelative` (now exported from +`cmd-validate.ts`), and `files.content_hash` — same primitives the +existing `validate` command already uses for stale detection. No +schema change. + +Test coverage: 19 engine tests (lookup variants, line slicing, stale +detection, missing files), 13 cmd-show parser/envelope tests, 11 +cmd-snippet parser/envelope/stale tests, 8 in-process MCP integration +tests via `@modelcontextprotocol/sdk`'s `InMemoryTransport`. diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 0f156b7..32e1e4c 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -141,6 +141,24 @@ jobs: bun run dev --full bun run benchmark + audit: + # Non-blocking — visibility into transitive-dep CVEs without gating PRs. + # Promote to a hard gate once the team agrees on a vulnerability budget. 
+ name: 🛡 Audit (non-blocking) + needs: skip-ci + if: needs['skip-ci'].outputs.skip != 'true' + runs-on: ubuntu-latest + continue-on-error: true + steps: + - name: Checkout + uses: actions/checkout@v4 + + - name: Setup + uses: ./.github/actions/setup + + - name: bun audit + run: bun audit + ci-complete: name: CI complete needs: [skip-ci, format, lint, typecheck, test, build, benchmark] diff --git a/README.md b/README.md index 483b317..6abd670 100644 --- a/README.md +++ b/README.md @@ -118,6 +118,14 @@ echo "SELECT path FROM files WHERE language IN ('ts', 'tsx') AND line_count > 50 > .codemap/recipes/big-ts-files.sql codemap query --recipe big-ts-files # auto-discovered alongside bundled +# Targeted reads — precise lookup by symbol name without composing SQL +codemap show runQueryCmd # metadata: file:line + signature +codemap show foo --kind function --in src/cli # narrow ambiguous matches +codemap snippet runQueryCmd # same lookup + source text from disk +codemap snippet foo --json # {matches: [{...metadata, source, stale, missing}]} +# Output envelope is always {matches, disambiguation?} — single match → {matches: [{...}]}; +# multi-match adds disambiguation: {n, by_kind, files, hint} for agent-friendly narrowing. + # MCP server (Model Context Protocol) — for agent hosts (Claude Code, Cursor, Codex, generic MCP clients) codemap mcp # JSON-RPC on stdio; one tool per CLI verb plus query_batch # Tools: query, query_batch (MCP-only — N statements in one round-trip), query_recipe, audit, diff --git a/docs/architecture.md b/docs/architecture.md index f059bb5..6d2965e 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -125,6 +125,8 @@ A local SQLite database (`.codemap.db`) indexes the project tree and stores stru **Context wiring:** **`src/cli/cmd-context.ts`** — **`buildContextEnvelope`** composes the JSON envelope from existing recipes (`fan-in` for `hubs`, `markers` SELECT for `sample_markers`, `QUERY_RECIPES` map for the catalog). 
**`classifyIntent`** maps `--for ""` to one of `refactor | debug | test | feature | explore | other` via regex against the trimmed input; whitespace-only intents are rejected. `--compact` drops `hubs` + `sample_markers` and emits one-line JSON; otherwise pretty-prints with 2-space indent. +**Show / snippet wiring:** **`src/cli/cmd-show.ts`** + **`src/cli/cmd-snippet.ts`** — sibling CLI verbs sharing the same parser shape (`` + `--kind` + `--in ` + `--json`) and the pure engine **`src/application/show-engine.ts`** (`findSymbolsByName(db, {name, kind?, inPath?})` for the lookup; `readSymbolSource({match, projectRoot, indexedContentHash?})` + `getIndexedContentHash(db, filePath)` for the snippet-side FS read). Both verbs return the same `{matches, disambiguation?}` envelope per plan § 4 uniformity — single match → `{matches: [{...}]}`; multi-match adds `{n, by_kind, files, hint}`. Snippet matches add `source` / `stale` / `missing` fields (additive — no shape divergence). **`--in `** is normalized through `toProjectRelative(projectRoot, p)` (exported from **`src/cli/cmd-validate.ts`**) so `--in ./src/cli/`, `--in src/cli`, and `--in src/cli/cmd-show.ts` are all normalized the same way. Stale-file behavior on `snippet`: `hashContent` (from **`src/hash.ts`** — same primitive `cmd-validate.ts` uses) hashes the on-disk content and compares the result against `files.content_hash`; mismatch sets `stale: true` but the source IS still returned (read tool, no auto-reindex side-effects). MCP tools `show` and `snippet` register parallel to the CLI surface (see [§ MCP wiring](#cli-usage)). + **Recipes wiring:** **`src/application/recipes-loader.ts`** (pure transport-agnostic loader) + **`src/cli/query-recipes.ts`** (shim — caches the loader output, exposes `getQueryRecipeSql` / `getQueryRecipeActions` / `listQueryRecipeIds` / `listQueryRecipeCatalog` / `getQueryRecipeCatalogEntry`). Recipes live as file pairs: **`.sql`** + optional **`.md`**. 
The loader reads `templates/recipes/` (bundled, ships in npm package next to `templates/agents/`) and `/.codemap/recipes/` (project-local — root-only resolution per the registry plan, no walk-up). Project recipes win on id collision; entries that override a bundled id carry **`shadows: true`** in the catalog so agents reading `codemap://recipes` at session start see when a recipe behaves differently from the documented bundled version. Per-row **`actions`** templates (kebab-case verb + description) live in YAML frontmatter on each `.md` — uniform shape across bundled + project. Hand-rolled YAML parser scoped to `actions: [{type, auto_fixable?, description?}]` only (no `js-yaml` dep). Load-time validation rejects empty SQL and DML / DDL keywords (`INSERT` / `UPDATE` / `DELETE` / `DROP` / `CREATE` / `ALTER` / `ATTACH` / `DETACH` / `REPLACE` / `TRUNCATE` / `VACUUM` / `PRAGMA`) with recipe-aware error messages — defence in depth alongside the runtime `PRAGMA query_only=1` backstop in `query-engine.ts` (PR #35). `.codemap.db` is gitignored; `.codemap/recipes/` is NOT (verified via `git check-ignore`) — recipes are git-tracked source code authored for human review. **MCP wiring:** **`src/cli/cmd-mcp.ts`** (argv — `--help` only; bootstrap absorbs `--root`/`--config`) + **`src/application/mcp-server.ts`** (engine — tool registry, resource handlers, response composition). Mirrors the `cmd-audit.ts ↔ audit-engine.ts` seam — CLI parses + lifecycle; engine owns the SDK. **`runMcpServer`** bootstraps codemap once at server boot (config + resolver + DB access become module-level state), instantiates `McpServer` from **`@modelcontextprotocol/sdk`**, attaches a **`StdioServerTransport`**, and resolves when stdin closes (clean shutdown). 
Tool handlers reuse the existing engine entry-points: **`query`** + **`query_recipe`** call **`executeQuery`** in **`src/application/query-engine.ts`** (a pure transport-agnostic engine extracted from `printQueryResult`'s JSON branch — same `[...rows]` / `{count}` / `{group_by, groups}` envelope `--json` would print); **`query_batch`** loops via **`executeQueryBatch`** with batch-wide-defaults + per-statement-overrides (items are `string | {sql, summary?, changed_since?, group_by?}`); **`audit`** runs `resolveAuditBaselines` + `runAudit` from PR #33 unchanged; **`context`** / **`validate`** call `buildContextEnvelope` / `computeValidateRows` (pure functions in `src/cli/cmd-*.ts` — same layer-reversal allowance as `query-recipes`). **`save_baseline`** is one polymorphic tool (`{name, sql? | recipe?}`) with a runtime exclusivity check — mirrors the CLI's single `--save-baseline=` verb. **Tool naming**: snake_case throughout — Codemap convention matching the patterns in MCP spec examples and reference servers (GitHub MCP, Cursor built-ins); the spec itself doesn't mandate it. CLI stays kebab — translation lives at the MCP-arg layer. **Resources** (`codemap://recipes`, `codemap://recipes/{id}`, `codemap://schema`, `codemap://skill`) use **lazy memoisation** — first `read_resource` populates a per-server-instance cache; constant for the server-process lifetime so eager-vs-lazy produce identical observable behavior. `codemap://schema` queries `sqlite_schema` live; `codemap://skill` reads from `resolveAgentsTemplateDir() + skills/codemap/SKILL.md`. Output shape uniformity (plan § 4): every tool returns the JSON envelope its CLI counterpart's `--json` flag prints, surfaced via `content: [{type: "text", text: JSON.stringify(payload)}]`. `--changed-since` git lookups are memoised per `(root, ref)` pair across batch items so a `query_batch` of N items sharing the same ref does one git invocation, not N. 
Per-statement errors in `query_batch` are isolated — failed statements return `{error}` in their slot while siblings still execute. diff --git a/docs/glossary.md b/docs/glossary.md index 3f650ef..a66d5be 100644 --- a/docs/glossary.md +++ b/docs/glossary.md @@ -368,6 +368,14 @@ Conceptually, the structure of the SQLite database — every table, column, cons Integer constant in `src/db.ts`. Bumped whenever the DDL changes. `createSchema()` reads `meta.schema_version` and triggers a full rebuild on mismatch. +### show + +`codemap show ` — one-step lookup that returns metadata (`file_path:line_start-line_end` + `signature` + `kind`) for the symbol(s) matching `` (exact, case-sensitive). Output is the `{matches, disambiguation?}` envelope (single match → `{matches: [{...}]}`; multi-match adds `disambiguation: {n, by_kind, files, hint}` so agents narrow without scanning every row). Flags: `--kind ` (filter by `symbols.kind`), `--in ` (file-scope filter — directory prefix or exact file). Distinct from **snippet** (returns source text, not just metadata) and from `query` with `WHERE name = ?` (one verb vs SQL composition; see [`architecture.md` § Show / snippet wiring](./architecture.md#cli-usage)). + +### snippet + +`codemap snippet ` — same lookup as **show**, but each match also carries `source` (file lines from disk at `line_start..line_end`), `stale` (true when content_hash drifted since last index — line range may have shifted), and `missing` (true when file is gone). Per-execution shape mirrors `show`'s envelope; source/stale/missing are additive fields. Stale-file behavior: `source` is ALWAYS returned when the file exists; `stale: true` is metadata the agent reads (no refusal, no auto-reindex side-effects from a read tool — agent decides whether to act on possibly-shifted lines or run `codemap` first). See [`architecture.md` § Show / snippet wiring](./architecture.md#cli-usage). + ### skill A `.agents/skills//SKILL.md` file with YAML frontmatter. 
Longer than a rule; describes a complete agent workflow. Distinct from a **rule** (shorter, normative). diff --git a/docs/roadmap.md b/docs/roadmap.md index f491131..0f5a110 100644 --- a/docs/roadmap.md +++ b/docs/roadmap.md @@ -39,7 +39,6 @@ Codemap stays a structural-index primitive that other tools can consume. Out of - [ ] **`codemap audit --base `** (v1.x) — worktree+reindex snapshot strategy. v1 shipped `--baseline ` / `---baseline ` (B.6 reuse) — see [`architecture.md` § Audit wiring](./architecture.md#cli-usage). v1.x adds `--base ` for "audit against an arbitrary ref I haven't pre-baselined" (defers worktree spawn + cache decision until a real consumer asks). - [ ] **`codemap audit` verdict + thresholds** (v1.x) — `verdict: "pass" | "warn" | "fail"` driven by `codemap.config.audit.deltas[].{added_max, action}`. Triggers: two consumers ship `jq`-based threshold scripts with similar shapes, OR one consumer asks with a concrete config sketch. Until then, raw deltas + consumer-side `jq` is the CI exit-code idiom. - [ ] **`codemap serve` (HTTP API, v1.x)** — same tool taxonomy + output shape as `codemap mcp` (shipped in v1), exposed over `POST /tool/{name}` with loopback default and optional `--token`. Defer until a concrete non-MCP consumer asks; design points are reserved in [`architecture.md` § MCP wiring](./architecture.md#cli-usage) so HTTP inherits them when its turn comes. -- [ ] **Targeted-read CLI** — `codemap show ` / `codemap snippet ` returns `file_path:line_start-line_end` + `signature` for one symbol. 
Same data as `SELECT … FROM symbols WHERE name = ?`, but a one-step CLI keeps agents from composing SQL for trivial precise reads - [ ] **Watch mode** for dev — `node:fs.watch` recursive + `--files` re-index loop; Linux `recursive` requires Node 19.1+ - [ ] **Monorepo / workspace awareness** — discover workspaces from `pnpm-workspace.yaml` / `package.json` and index per-workspace dependency graphs - [ ] **Cross-agent handoff artifact** — _speculative_; layered prefix/delta JSON written on session-stop, read on session-start. Complementary to indexing rather than core to it; revisit if user demand emerges diff --git a/src/agents-init.test.ts b/src/agents-init.test.ts index 7e75224..0feca04 100644 --- a/src/agents-init.test.ts +++ b/src/agents-init.test.ts @@ -16,6 +16,7 @@ import { CODMAP_POINTER_END, ensureGitignoreCodemapPattern, listRegularFilesRecursive, + relPathToAbsSegments, resolveAgentsTemplateDir, runAgentsInit, targetsNeedLinkMode, @@ -412,3 +413,37 @@ describe("upsertCodemapPointerFile", () => { } }); }); + +describe("relPathToAbsSegments — defence-in-depth path safety", () => { + it("returns segments for a normal relative path", () => { + expect(relPathToAbsSegments("rules/codemap.md")).toEqual([ + "rules", + "codemap.md", + ]); + }); + + it("filters empty segments (leading / trailing / double slashes)", () => { + expect(relPathToAbsSegments("/rules//codemap.md/")).toEqual([ + "rules", + "codemap.md", + ]); + }); + + it("rejects `..` segment", () => { + expect(() => relPathToAbsSegments("../etc/passwd")).toThrow( + /refusing path with ".." segment/, + ); + }); + + it("rejects `..` segment in the middle of the path", () => { + expect(() => relPathToAbsSegments("rules/../../etc/passwd")).toThrow( + /refusing path with ".." segment/, + ); + }); + + it("rejects `.` segment", () => { + expect(() => relPathToAbsSegments("rules/./codemap.md")).toThrow( + /refusing path with "." 
segment/, + ); + }); +}); diff --git a/src/agents-init.ts b/src/agents-init.ts index a2dbe8a..e366edc 100644 --- a/src/agents-init.ts +++ b/src/agents-init.ts @@ -50,8 +50,23 @@ export function listRegularFilesRecursive( return out; } -function relPathToAbsSegments(rel: string): string[] { - return rel.split("/").filter(Boolean); +/** + * Split a `/`-relative path into segments, rejecting `..` / `.` so callers + * can't `join(destRoot, ...)` into a path that escapes `destRoot`. Defence + * in depth — today's callers source `rel` from `listRegularFilesRecursive` + * (package-controlled, never produces `..`); throwing surfaces future + * regressions loudly instead of silently writing outside the dest. + */ +export function relPathToAbsSegments(rel: string): string[] { + const segments = rel.split("/").filter((s) => s.length > 0); + for (const seg of segments) { + if (seg === ".." || seg === ".") { + throw new Error( + `relPathToAbsSegments: refusing path with "${seg}" segment: ${JSON.stringify(rel)}`, + ); + } + } + return segments; } /** Copy only listed relative paths from `srcRoot` into `destRoot` (mkdir parents per file). */ diff --git a/src/application/mcp-server.test.ts b/src/application/mcp-server.test.ts index 45b08ed..6400f30 100644 --- a/src/application/mcp-server.test.ts +++ b/src/application/mcp-server.test.ts @@ -625,3 +625,193 @@ describe("MCP server — resources", () => { } }); }); + +describe("MCP server — show + snippet tools", () => { + function seedSymbol(opts: { + file: string; + name: string; + kind?: string; + lineStart?: number; + lineEnd?: number; + }) { + const db = openDb(); + try { + db.run( + `INSERT INTO symbols (file_path, name, kind, line_start, line_end, signature, is_exported, is_default_export) + VALUES (?, ?, ?, ?, ?, ?, 1, 0)`, + [ + opts.file, + opts.name, + opts.kind ?? "function", + opts.lineStart ?? 1, + opts.lineEnd ?? 1, + `${opts.kind ?? 
"function"} ${opts.name}(): void`, + ], + ); + } finally { + closeDb(db); + } + } + + it("lists show + snippet in tools/list", async () => { + const { client, server } = await makeClient(); + try { + const tools = await client.listTools(); + const names = tools.tools.map((t) => t.name); + expect(names).toContain("show"); + expect(names).toContain("snippet"); + } finally { + await server.close(); + } + }); + + it("show returns {matches} envelope for single match", async () => { + seedSymbol({ file: "src/a.ts", name: "myFn", lineStart: 5, lineEnd: 10 }); + const { client, server } = await makeClient(); + try { + const r = await client.callTool({ + name: "show", + arguments: { name: "myFn" }, + }); + const json = readJson(r); + expect(json.matches).toHaveLength(1); + expect(json.matches[0]).toMatchObject({ + name: "myFn", + file_path: "src/a.ts", + line_start: 5, + line_end: 10, + }); + expect(json.disambiguation).toBeUndefined(); + } finally { + await server.close(); + } + }); + + it("show adds disambiguation block for multi-match", async () => { + seedSymbol({ file: "src/a.ts", name: "shared", kind: "function" }); + seedSymbol({ file: "src/b.ts", name: "shared", kind: "const" }); + const { client, server } = await makeClient(); + try { + const r = await client.callTool({ + name: "show", + arguments: { name: "shared" }, + }); + const json = readJson(r); + expect(json.matches).toHaveLength(2); + expect(json.disambiguation).toMatchObject({ + n: 2, + by_kind: { function: 1, const: 1 }, + files: ["src/a.ts", "src/b.ts"], + }); + } finally { + await server.close(); + } + }); + + it("show with `in` filter narrows to one file", async () => { + seedSymbol({ file: "src/a.ts", name: "shared" }); + seedSymbol({ file: "src/b.ts", name: "shared" }); + const { client, server } = await makeClient(); + try { + const r = await client.callTool({ + name: "show", + arguments: { name: "shared", in: "src/a.ts" }, + }); + const json = readJson(r); + expect(json.matches).toHaveLength(1); + 
expect(json.matches[0].file_path).toBe("src/a.ts"); + } finally { + await server.close(); + } + }); + + it("show returns empty matches when name unknown", async () => { + const { client, server } = await makeClient(); + try { + const r = await client.callTool({ + name: "show", + arguments: { name: "definitely-not-a-real-symbol-xyz" }, + }); + const json = readJson(r); + expect(json.matches).toEqual([]); + } finally { + await server.close(); + } + }); + + it("snippet returns source text from disk + stale: false on fresh file", async () => { + // Write a real file matching the seeded `files` row in the bench setup + // (src/a.ts already exists with hash 'h1' but content "export const A = 1;\n"). + // Seed a symbol pointing at line 1. + seedSymbol({ + file: "src/a.ts", + name: "A", + kind: "const", + lineStart: 1, + lineEnd: 1, + }); + // The bench uses content_hash = 'h1' which DOES NOT match hashContent("export const A = 1;\n"), + // so the engine will report stale: true. To test stale: false we'd need to update the row's hash. + const db = openDb(); + try { + const realHash = ( + require("../hash") as typeof import("../hash") + ).hashContent("export const A = 1;\n"); + db.run("UPDATE files SET content_hash = ? WHERE path = ?", [ + realHash, + "src/a.ts", + ]); + } finally { + closeDb(db); + } + const { client, server } = await makeClient(); + try { + const r = await client.callTool({ + name: "snippet", + arguments: { name: "A" }, + }); + const json = readJson(r); + expect(json.matches).toHaveLength(1); + expect(json.matches[0].source).toBe("export const A = 1;"); + expect(json.matches[0].stale).toBe(false); + expect(json.matches[0].missing).toBe(false); + } finally { + await server.close(); + } + }); + + it("snippet flags stale: true when on-disk content drifts from indexed hash", async () => { + // Bench file content is "export const A = 1;\n" but indexed hash is 'h1' (mismatch). 
+ seedSymbol({ file: "src/a.ts", name: "A", lineStart: 1, lineEnd: 1 }); + const { client, server } = await makeClient(); + try { + const r = await client.callTool({ + name: "snippet", + arguments: { name: "A" }, + }); + const json = readJson(r); + expect(json.matches[0].stale).toBe(true); + // Source is still returned per Q-6 settled. + expect(json.matches[0].source).toBe("export const A = 1;"); + } finally { + await server.close(); + } + }); + + it("snippet flags missing: true when file is gone on disk", async () => { + seedSymbol({ file: "src/b.ts", name: "B", lineStart: 1, lineEnd: 1 }); + // src/b.ts is in the indexed `files` but no actual file on disk in bench setup. + const { client, server } = await makeClient(); + try { + const r = await client.callTool({ + name: "snippet", + arguments: { name: "B" }, + }); + const json = readJson(r); + expect(json.matches[0].missing).toBe(true); + expect(json.matches[0].source).toBeUndefined(); + } finally { + await server.close(); + } + }); +}); diff --git a/src/application/mcp-server.ts b/src/application/mcp-server.ts index 402425e..3f3d0fa 100644 --- a/src/application/mcp-server.ts +++ b/src/application/mcp-server.ts @@ -17,7 +17,9 @@ import { resolveAgentsTemplateDir } from "../agents-init"; // once a second consumer (HTTP API) needs them. 
import { resolveAuditBaselines } from "../cli/cmd-audit"; import { buildContextEnvelope } from "../cli/cmd-context"; -import { computeValidateRows } from "../cli/cmd-validate"; +import { buildShowResult } from "../cli/cmd-show"; +import { buildSnippetResult } from "../cli/cmd-snippet"; +import { computeValidateRows, toProjectRelative } from "../cli/cmd-validate"; import { getQueryRecipeActions, getQueryRecipeCatalogEntry, @@ -41,6 +43,7 @@ import { runAudit } from "./audit-engine"; import { getCurrentCommit } from "./index-engine"; import { executeQuery } from "./query-engine"; import { runCodemapIndex } from "./run-index"; +import { findSymbolsByName } from "./show-engine"; /** * MCP server engine — owns the tool / resource registry. CLI shell @@ -154,6 +157,8 @@ export function createMcpServer(opts: ServerOpts): McpServer { registerSaveBaselineTool(server, opts); registerListBaselinesTool(server, opts); registerDropBaselineTool(server, opts); + registerShowTool(server, opts); + registerSnippetTool(server, opts); registerResources(server); return server; @@ -570,6 +575,80 @@ function registerDropBaselineTool(server: McpServer, _opts: ServerOpts): void { ); } +function registerShowTool(server: McpServer, opts: ServerOpts): void { + server.registerTool( + "show", + { + description: + "Look up symbol(s) by exact name; returns {matches: [{name, kind, file_path, line_start, line_end, signature, ...}]} with structured `disambiguation` block when multiple matches. One-step lookup that beats composing `SELECT … FROM symbols WHERE name = ?` by hand. Use `snippet` for the actual source text; use `query` with `LIKE` for fuzzy lookup.", + inputSchema: { + name: z.string().min(1, "name must be a non-empty string"), + kind: z.string().optional(), + in: z.string().optional(), + }, + }, + (args) => { + try { + const db = openDb(); + try { + const inPath = + args.in !== undefined && args.in.length > 0 + ? 
toProjectRelative(opts.root, args.in) + : undefined; + const matches = findSymbolsByName(db, { + name: args.name, + kind: args.kind, + inPath, + }); + return jsonResult(buildShowResult(matches)); + } finally { + closeDb(db, { readonly: true }); + } + } catch (err) { + return jsonError(err instanceof Error ? err.message : String(err)); + } + }, + ); +} + +function registerSnippetTool(server: McpServer, opts: ServerOpts): void { + server.registerTool( + "snippet", + { + description: + "Same lookup as `show` but each match carries `source` (file lines from disk at line_start..line_end) plus `stale` (true when content_hash drifted since indexing — line range may have shifted; agent decides whether to act or re-index) and `missing` (true when file is gone). Per-execution shape mirrors `show`'s envelope; source/stale/missing are additive fields on each match.", + inputSchema: { + name: z.string().min(1, "name must be a non-empty string"), + kind: z.string().optional(), + in: z.string().optional(), + }, + }, + (args) => { + try { + const db = openDb(); + try { + const inPath = + args.in !== undefined && args.in.length > 0 + ? toProjectRelative(opts.root, args.in) + : undefined; + const matches = findSymbolsByName(db, { + name: args.name, + kind: args.kind, + inPath, + }); + return jsonResult( + buildSnippetResult({ db, matches, projectRoot: opts.root }), + ); + } finally { + closeDb(db, { readonly: true }); + } + } catch (err) { + return jsonError(err instanceof Error ? err.message : String(err)); + } + }, + ); +} + /** * MCP resources are addressable read-only data the host can fetch ahead of * tool calls. 
Plan § 7 + grill round Q3 settled on **lazy memoisation**: diff --git a/src/application/show-engine.test.ts b/src/application/show-engine.test.ts new file mode 100644 index 0000000..f2b5333 --- /dev/null +++ b/src/application/show-engine.test.ts @@ -0,0 +1,325 @@ +import { afterEach, beforeEach, describe, expect, it } from "bun:test"; +import { mkdirSync, mkdtempSync, rmSync, writeFileSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; + +import { createTables } from "../db"; +import type { CodemapDatabase } from "../db"; +import { hashContent } from "../hash"; +import { openCodemapDatabase } from "../sqlite-db"; +import { + escapeLikeLiteral, + findSymbolsByName, + getIndexedContentHash, + readSymbolSource, +} from "./show-engine"; +import type { SymbolMatch } from "./show-engine"; + +let db: CodemapDatabase; + +beforeEach(() => { + db = openCodemapDatabase(":memory:"); + createTables(db); + // Seed the three `files` rows first so `symbols.file_path` foreign keys resolve. + db.run( + "INSERT INTO files (path, content_hash, size, line_count, language, last_modified, indexed_at) VALUES (?, ?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?, ?)", + [ + "src/cli/cmd-show.ts", + "h1", + 100, + 30, + "ts", + 1, + 1, + "src/legacy/foo.ts", + "h2", + 80, + 20, + "ts", + 1, + 1, + "src/test/fixtures.ts", + "h3", + 50, + 15, + "ts", + 1, + 1, + ], + ); + // Three symbols named `foo` across three files (two functions, one const), plus a unique `bar`.
+ db.run( + `INSERT INTO symbols (file_path, name, kind, line_start, line_end, signature, is_exported, is_default_export) + VALUES + ('src/cli/cmd-show.ts', 'foo', 'function', 5, 15, 'function foo(): void', 1, 0), + ('src/legacy/foo.ts', 'foo', 'function', 1, 50, 'function foo(arg: string): number', 0, 0), + ('src/test/fixtures.ts','foo', 'const', 3, 3, 'const foo = 42', 1, 0), + ('src/cli/cmd-show.ts', 'bar', 'function', 20, 25,'function bar(): string', 1, 0)`, + ); +}); + +afterEach(() => { + db.close(); +}); + +describe("findSymbolsByName", () => { + it("returns single match for a unique name", () => { + const r = findSymbolsByName(db, { name: "bar" }); + expect(r).toHaveLength(1); + expect(r[0]).toMatchObject({ + name: "bar", + kind: "function", + file_path: "src/cli/cmd-show.ts", + line_start: 20, + line_end: 25, + }); + }); + + it("returns empty array for unknown name", () => { + expect(findSymbolsByName(db, { name: "no-such-symbol" })).toEqual([]); + }); + + it("returns all matches for an ambiguous name (deterministic order)", () => { + const r = findSymbolsByName(db, { name: "foo" }); + expect(r).toHaveLength(3); + // Ordered by file_path ASC, line_start ASC. 
+ expect(r.map((m) => m.file_path)).toEqual([ + "src/cli/cmd-show.ts", + "src/legacy/foo.ts", + "src/test/fixtures.ts", + ]); + }); + + it("filters by kind when set", () => { + const r = findSymbolsByName(db, { name: "foo", kind: "const" }); + expect(r).toHaveLength(1); + expect(r[0]!.file_path).toBe("src/test/fixtures.ts"); + }); + + it("kind=function narrows ambiguous name to 2 matches", () => { + const r = findSymbolsByName(db, { name: "foo", kind: "function" }); + expect(r).toHaveLength(2); + expect(r.map((m) => m.file_path)).toEqual([ + "src/cli/cmd-show.ts", + "src/legacy/foo.ts", + ]); + }); + + it("inPath as directory (no extension) treats as prefix", () => { + const r = findSymbolsByName(db, { name: "foo", inPath: "src/cli" }); + expect(r).toHaveLength(1); + expect(r[0]!.file_path).toBe("src/cli/cmd-show.ts"); + }); + + it("inPath with trailing slash treats as prefix", () => { + const r = findSymbolsByName(db, { name: "foo", inPath: "src/legacy/" }); + expect(r).toHaveLength(1); + expect(r[0]!.file_path).toBe("src/legacy/foo.ts"); + }); + + it("inPath with file extension treats as exact match", () => { + const r = findSymbolsByName(db, { + name: "foo", + inPath: "src/test/fixtures.ts", + }); + expect(r).toHaveLength(1); + expect(r[0]!.kind).toBe("const"); + }); + + it("inPath exact-match misses when path doesn't match", () => { + const r = findSymbolsByName(db, { + name: "foo", + inPath: "src/test/other.ts", + }); + expect(r).toEqual([]); + }); + + it("inPath + kind compose (AND, not OR)", () => { + const r = findSymbolsByName(db, { + name: "foo", + kind: "function", + inPath: "src/cli", + }); + expect(r).toHaveLength(1); + expect(r[0]!.file_path).toBe("src/cli/cmd-show.ts"); + }); + + it("returns kind/visibility/parent_name fields", () => { + const r = findSymbolsByName(db, { name: "bar" }); + expect(r[0]).toMatchObject({ + kind: "function", + visibility: null, + parent_name: null, + is_exported: 1, + }); + }); + + it("name match is case-sensitive", () => 
{ + expect(findSymbolsByName(db, { name: "FOO" })).toEqual([]); + expect(findSymbolsByName(db, { name: "Foo" })).toEqual([]); + }); + + it("inPath with `_` matches the literal directory, not via LIKE wildcard", () => { + // Seed two files: real `__tests__` directory + a same-shape `aatestsZZ` + // that would over-match if `_` were treated as a SQL LIKE wildcard. + db.run( + "INSERT INTO files (path, content_hash, size, line_count, language, last_modified, indexed_at) VALUES (?, ?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?, ?)", + [ + "src/__tests__/setup.ts", + "h-test", + 10, + 3, + "ts", + 1, + 1, + "src/aatestsZZ/decoy.ts", + "h-decoy", + 10, + 3, + "ts", + 1, + 1, + ], + ); + db.run( + `INSERT INTO symbols (file_path, name, kind, line_start, line_end, signature, is_exported, is_default_export) + VALUES + ('src/__tests__/setup.ts', 'shared', 'function', 1, 1, 'function shared(): void', 1, 0), + ('src/aatestsZZ/decoy.ts', 'shared', 'function', 1, 1, 'function shared(): void', 1, 0)`, + ); + const r = findSymbolsByName(db, { + name: "shared", + inPath: "src/__tests__", + }); + expect(r).toHaveLength(1); + expect(r[0]!.file_path).toBe("src/__tests__/setup.ts"); + }); +}); + +describe("escapeLikeLiteral", () => { + it("escapes underscores", () => { + expect(escapeLikeLiteral("foo_bar")).toBe("foo\\_bar"); + }); + it("escapes percents", () => { + expect(escapeLikeLiteral("100%")).toBe("100\\%"); + }); + it("escapes the backslash escape char itself", () => { + expect(escapeLikeLiteral("a\\b")).toBe("a\\\\b"); + }); + it("leaves ordinary characters alone", () => { + expect(escapeLikeLiteral("src/cli/cmd-show.ts")).toBe( + "src/cli/cmd-show.ts", + ); + }); +}); + +describe("readSymbolSource — line slicing + stale detection (Q-6)", () => { + let projectRoot: string; + + function makeMatch( + file: string, + lineStart: number, + lineEnd: number, + ): SymbolMatch { + return { + name: "x", + kind: "function", + file_path: file, + line_start: lineStart, + line_end: lineEnd, + 
signature: "function x(): void", + is_exported: 0, + parent_name: null, + visibility: null, + }; + } + + beforeEach(() => { + projectRoot = mkdtempSync(join(tmpdir(), "show-engine-source-")); + mkdirSync(join(projectRoot, "src"), { recursive: true }); + }); + + afterEach(() => { + rmSync(projectRoot, { recursive: true, force: true }); + }); + + it("slices lines 1-indexed inclusive", () => { + const text = "line 1\nline 2\nline 3\nline 4\nline 5\n"; + writeFileSync(join(projectRoot, "src/x.ts"), text); + const r = readSymbolSource({ + match: makeMatch("src/x.ts", 2, 4), + projectRoot, + }); + expect(r.source).toBe("line 2\nline 3\nline 4"); + expect(r.stale).toBe(false); + expect(r.missing).toBe(false); + }); + + it("flags missing file with stale: true + missing: true", () => { + const r = readSymbolSource({ + match: makeMatch("src/nope.ts", 1, 5), + projectRoot, + }); + expect(r.source).toBeUndefined(); + expect(r.missing).toBe(true); + expect(r.stale).toBe(true); + }); + + it("returns stale: false when content_hash matches indexed value", () => { + const text = "fresh content\n"; + writeFileSync(join(projectRoot, "src/x.ts"), text); + const r = readSymbolSource({ + match: makeMatch("src/x.ts", 1, 1), + projectRoot, + indexedContentHash: hashContent(text), + }); + expect(r.stale).toBe(false); + expect(r.source).toBe("fresh content"); + }); + + it("returns stale: true when content has changed since index", () => { + writeFileSync(join(projectRoot, "src/x.ts"), "old\n"); + const oldHash = hashContent("old\n"); + writeFileSync(join(projectRoot, "src/x.ts"), "modified\n"); + const r = readSymbolSource({ + match: makeMatch("src/x.ts", 1, 1), + projectRoot, + indexedContentHash: oldHash, + }); + expect(r.stale).toBe(true); + // Source still returned (Q-6 settled — read + flag). 
+ expect(r.source).toBe("modified"); + }); + + it("clamps line_end past EOF instead of throwing", () => { + writeFileSync(join(projectRoot, "src/x.ts"), "only line\n"); + const r = readSymbolSource({ + match: makeMatch("src/x.ts", 1, 999), + projectRoot, + }); + expect(r.source).toBe("only line\n"); // includes the trailing newline split + }); + + it("indexedContentHash undefined → never marks stale", () => { + writeFileSync(join(projectRoot, "src/x.ts"), "anything\n"); + const r = readSymbolSource({ + match: makeMatch("src/x.ts", 1, 1), + projectRoot, + }); + expect(r.stale).toBe(false); + }); +}); + +describe("getIndexedContentHash", () => { + it("returns the stored hash for an indexed path", () => { + const fresh = openCodemapDatabase(":memory:"); + createTables(fresh); + fresh.run( + "INSERT INTO files (path, content_hash, size, line_count, language, last_modified, indexed_at) VALUES (?, ?, ?, ?, ?, ?, ?)", + ["src/a.ts", "abc123", 10, 1, "ts", 1, 1], + ); + expect(getIndexedContentHash(fresh, "src/a.ts")).toBe("abc123"); + expect(getIndexedContentHash(fresh, "src/missing.ts")).toBeUndefined(); + fresh.close(); + }); +}); diff --git a/src/application/show-engine.ts b/src/application/show-engine.ts new file mode 100644 index 0000000..798a5f9 --- /dev/null +++ b/src/application/show-engine.ts @@ -0,0 +1,174 @@ +import { existsSync, readFileSync } from "node:fs"; +import { join } from "node:path"; + +import type { CodemapDatabase } from "../db"; +import { hashContent } from "../hash"; + +/** + * One row from the `symbols` table — the canonical match shape returned by + * `findSymbolsByName`. Same columns the CLI / MCP `show` verbs surface in + * their `--json` envelopes, plus the always-present `signature` so an agent + * can disambiguate without a follow-up read. 
+ */ +export interface SymbolMatch { + name: string; + kind: string; + file_path: string; + line_start: number; + line_end: number; + signature: string; + is_exported: number; + parent_name: string | null; + visibility: string | null; +} + +export interface FindSymbolsOpts { + /** Exact symbol name (case-sensitive — per plan §9 Q-3). */ + name: string; + /** Optional `symbols.kind` filter (e.g. "function", "const", "class"). */ + kind?: string | undefined; + /** + * Optional file-scope filter. If `` ends with `/` or matches a + * directory shape, treats as prefix (`AND file_path LIKE 'src/cli/%'`); + * otherwise exact match (`AND file_path = 'src/cli/cmd-show.ts'`). + * Caller should normalize via `toProjectRelative` before passing — this + * engine does no path-shape massaging beyond the prefix/exact split. + */ + inPath?: string | undefined; +} + +/** + * Pure transport-agnostic lookup — same shape `cmd-show.ts` and the MCP + * `show` tool both call. Mirrors the `audit-engine.ts` / `query-engine.ts` + * pattern from PRs #33 / #35. + * + * Returns rows ordered deterministically (`file_path ASC, line_start ASC`) + * so callers can slice the array and get stable disambiguation output. + */ +export function findSymbolsByName( + db: CodemapDatabase, + opts: FindSymbolsOpts, +): SymbolMatch[] { + const clauses: string[] = ["name = ?"]; + const params: (string | number)[] = [opts.name]; + + if (opts.kind !== undefined && opts.kind.length > 0) { + clauses.push("kind = ?"); + params.push(opts.kind); + } + + if (opts.inPath !== undefined && opts.inPath.length > 0) { + if (looksLikeDirectory(opts.inPath)) { + const prefix = opts.inPath.endsWith("/") + ? opts.inPath + : `${opts.inPath}/`; + // Escape user input so `src/__tests__` doesn't over-match via SQL + // LIKE's `_`-matches-any-char rule. Trailing `%` stays a wildcard. + clauses.push("file_path LIKE ? 
ESCAPE '\\'"); + params.push(`${escapeLikeLiteral(prefix)}%`); + } else { + clauses.push("file_path = ?"); + params.push(opts.inPath); + } + } + + const sql = `SELECT name, kind, file_path, line_start, line_end, signature, + is_exported, parent_name, visibility + FROM symbols + WHERE ${clauses.join(" AND ")} + ORDER BY file_path ASC, line_start ASC`; + return db.query(sql).all(...params) as SymbolMatch[]; +} + +/** + * Escape SQLite LIKE meta-characters (`_`, `%`) and the escape character + * itself so a user-supplied path matches literally. Used with + * `file_path LIKE ? ESCAPE '\'`. + */ +export function escapeLikeLiteral(s: string): string { + return s.replace(/[\\_%]/g, (c) => `\\${c}`); +} + +// Heuristic: `--in src/cli/` (trailing slash) and `--in src/cli` (no slash, no +// dot) both mean "prefix"; `--in src/cli/cmd-show.ts` (has a file extension +// after the last slash) means "exact file match". Conservative: anything +// ambiguous treats as prefix — over-matching is recoverable (agent narrows +// further); under-matching silently misses results. +function looksLikeDirectory(p: string): boolean { + if (p.endsWith("/")) return true; + const lastSlash = p.lastIndexOf("/"); + const tail = lastSlash === -1 ? p : p.slice(lastSlash + 1); + // No `.` in the trailing segment → directory-shaped (e.g. `src/cli`). + // A `.` → file-shaped (e.g. `src/cli/cmd-show.ts`, `cmd-show.ts`). + return !tail.includes("."); +} + +/** + * Result of reading a symbol's source content from disk. `source` is the + * file lines from `match.line_start..match.line_end` joined by newlines. + * `stale` is true when the file's current content_hash differs from + * `match`'s recorded hash (per Q-6 settled — read + flag, no auto-reindex). + * `missing` is true when the file no longer exists on disk. 
+ */ +export interface ReadSourceResult { + source: string | undefined; + stale: boolean; + missing: boolean; +} + +export interface ReadSymbolSourceOpts { + match: SymbolMatch; + projectRoot: string; + /** + * The indexed `content_hash` for `match.file_path` — same value + * `cmd-validate.ts` reads. Pass `undefined` if the caller doesn't want + * stale detection (always returns `stale: false`); pass the value from + * `SELECT content_hash FROM files WHERE path = ?` to enable it. + */ + indexedContentHash?: string | undefined; +} + +/** + * Read a symbol's source text from disk and compare against the indexed + * hash for staleness. Per plan §9 Q-6 (settled): read + flag — agent + * decides whether to act on possibly-shifted line ranges. No auto-reindex + * (read tool, no side-effects); no refusal (data is already on disk). + * + * Same FS-read pattern `cmd-validate.ts` uses — `readFileSync(abs, "utf8")` + * + `hashContent(source) !== indexedHash`. Reuses `hashContent` from + * `src/hash.ts`. Line slicing is 1-indexed inclusive, matching the + * `symbols.line_start` / `line_end` column convention. + */ +export function readSymbolSource(opts: ReadSymbolSourceOpts): ReadSourceResult { + const abs = join(opts.projectRoot, opts.match.file_path); + if (!existsSync(abs)) { + return { source: undefined, stale: true, missing: true }; + } + const content = readFileSync(abs, "utf8"); + const stale = + opts.indexedContentHash !== undefined && + hashContent(content) !== opts.indexedContentHash; + const lines = content.split("\n"); + // line_start / line_end are 1-indexed inclusive in the symbols table; + // slice() is 0-indexed half-open, so subtract 1 from the start and use + // line_end as the exclusive upper bound. 
+ const start = Math.max(0, opts.match.line_start - 1); + const end = Math.min(lines.length, opts.match.line_end); + const source = lines.slice(start, end).join("\n"); + return { source, stale, missing: false }; +} + +/** + * Convenience: look up a file's indexed content_hash (same query + * `cmd-validate.ts` uses). Returns `undefined` for unindexed paths so the + * caller can decide what staleness means in that case. + */ +export function getIndexedContentHash( + db: CodemapDatabase, + filePath: string, +): string | undefined { + const row = db + .query("SELECT content_hash FROM files WHERE path = ?") + .get(filePath) as { content_hash: string } | null; + return row?.content_hash; +} diff --git a/src/cli/bootstrap.ts b/src/cli/bootstrap.ts index df838e6..bd227cc 100644 --- a/src/cli/bootstrap.ts +++ b/src/cli/bootstrap.ts @@ -28,6 +28,10 @@ Agents: MCP server (Model Context Protocol — for agent hosts): codemap mcp # stdio JSON-RPC, one tool per CLI verb +Targeted reads (precise lookup by symbol name): + codemap show [--kind ] [--in ] [--json] # metadata: file:line + signature + codemap snippet [--kind ] [--in ] [--json] # source text from disk + stale flag + Other: codemap version codemap --version, -V @@ -57,6 +61,8 @@ export function validateIndexModeArgs(rest: string[]): void { if (rest[0] === "context") return; if (rest[0] === "audit") return; if (rest[0] === "mcp") return; + if (rest[0] === "show") return; + if (rest[0] === "snippet") return; if (rest[0] === "agents") { if (rest[1] === "init") return; diff --git a/src/cli/cmd-show.test.ts b/src/cli/cmd-show.test.ts new file mode 100644 index 0000000..4b72265 --- /dev/null +++ b/src/cli/cmd-show.test.ts @@ -0,0 +1,130 @@ +import { describe, expect, it } from "bun:test"; + +import type { SymbolMatch } from "../application/show-engine"; +import { buildShowResult, parseShowRest } from "./cmd-show"; + +describe("parseShowRest", () => { + it("returns help on --help / -h", () => { + expect(parseShowRest(["show", 
"--help"]).kind).toBe("help"); + expect(parseShowRest(["show", "-h"]).kind).toBe("help"); + }); + + it("errors when no given", () => { + const r = parseShowRest(["show"]); + expect(r.kind).toBe("error"); + if (r.kind === "error") expect(r.message).toContain("missing "); + }); + + it("errors on extra positional argument (no fuzzy fallback)", () => { + const r = parseShowRest(["show", "foo", "bar"]); + expect(r.kind).toBe("error"); + if (r.kind === "error") expect(r.message).toContain("unexpected extra"); + }); + + it("errors on unknown flag", () => { + const r = parseShowRest(["show", "foo", "--regex"]); + expect(r.kind).toBe("error"); + if (r.kind === "error") expect(r.message).toContain("--regex"); + }); + + it("errors when --kind has no value", () => { + const r = parseShowRest(["show", "foo", "--kind"]); + expect(r.kind).toBe("error"); + if (r.kind === "error") expect(r.message).toContain("--kind"); + }); + + it("errors when --in has no value", () => { + const r = parseShowRest(["show", "foo", "--in"]); + expect(r.kind).toBe("error"); + if (r.kind === "error") expect(r.message).toContain("--in"); + }); + + it("parses bare name", () => { + const r = parseShowRest(["show", "foo"]); + expect(r).toEqual({ + kind: "run", + name: "foo", + kindFilter: undefined, + inPath: undefined, + json: false, + }); + }); + + it("parses name + flags in any order", () => { + const r = parseShowRest([ + "show", + "--json", + "--kind", + "function", + "foo", + "--in", + "src/cli", + ]); + expect(r).toEqual({ + kind: "run", + name: "foo", + kindFilter: "function", + inPath: "src/cli", + json: true, + }); + }); + + it("throws if rest[0] is not 'show'", () => { + expect(() => parseShowRest(["query"])).toThrow(); + }); +}); + +describe("buildShowResult — disambiguation envelope (Q-2)", () => { + function match( + file: string, + name: string, + kind = "function", + line = 1, + ): SymbolMatch { + return { + name, + kind, + file_path: file, + line_start: line, + line_end: line, + signature: 
`${kind} ${name}`, + is_exported: 1, + parent_name: null, + visibility: null, + }; + } + + it("single match → no disambiguation block", () => { + const r = buildShowResult([match("src/a.ts", "foo")]); + expect(r.matches).toHaveLength(1); + expect(r.disambiguation).toBeUndefined(); + }); + + it("zero matches → empty matches, no disambiguation", () => { + const r = buildShowResult([]); + expect(r).toEqual({ matches: [] }); + }); + + it("multi-match adds disambiguation with n + by_kind + files + hint", () => { + const r = buildShowResult([ + match("src/a.ts", "foo", "function"), + match("src/b.ts", "foo", "function"), + match("src/c.ts", "foo", "const"), + ]); + expect(r.matches).toHaveLength(3); + expect(r.disambiguation).toEqual({ + n: 3, + by_kind: { function: 2, const: 1 }, + files: ["src/a.ts", "src/b.ts", "src/c.ts"], + hint: "Multiple matches. Narrow with --kind or --in .", + }); + }); + + it("dedupes files in disambiguation.files", () => { + const r = buildShowResult([ + match("src/a.ts", "foo", "function", 5), + match("src/a.ts", "foo", "function", 50), + ]); + expect(r.disambiguation?.files).toEqual(["src/a.ts"]); + }); +}); diff --git a/src/cli/cmd-show.ts b/src/cli/cmd-show.ts new file mode 100644 index 0000000..7c0b4bd --- /dev/null +++ b/src/cli/cmd-show.ts @@ -0,0 +1,250 @@ +import { findSymbolsByName } from "../application/show-engine"; +import type { SymbolMatch } from "../application/show-engine"; +import { loadUserConfig, resolveCodemapConfig } from "../config"; +import { closeDb, openDb } from "../db"; +import { configureResolver } from "../resolver"; +import { getProjectRoot, getTsconfigPath, initCodemap } from "../runtime"; +import { toProjectRelative } from "./cmd-validate"; + +/** + * The catalog envelope returned by `show` — same shape both the CLI's + * `--json` mode and the MCP `show` tool surface (per plan §4 uniformity + * + Q-2 settled). 
Single match → `{matches: [{...}]}`; multi-match adds + * a structured `disambiguation` block so agents narrow without scanning + * every row. + */ +export interface ShowResult { + matches: SymbolMatch[]; + disambiguation?: { + n: number; + by_kind: Record<string, number>; + files: string[]; + hint: string; + }; +} + +interface ShowOpts { + root: string; + configFile: string | undefined; + name: string; + kind: string | undefined; + inPath: string | undefined; + json: boolean; +} + +/** + * Print `codemap show` usage. + */ +export function printShowCmdHelp(): void { + console.log(`Usage: codemap show [--kind ] [--in ] [--json] + +Look up symbol(s) by exact name and return file_path:line_start-line_end + +signature. One-step lookup that beats composing +\`SELECT … FROM symbols WHERE name = ?\` by hand. + +Args: + Exact symbol name (case-sensitive). + +Flags: + --kind Filter by symbols.kind (function / class / const / …). + --in Filter by file scope. Trailing slash or no extension + in the trailing segment treats as prefix; otherwise + exact file match. + --json Emit the JSON envelope (always wrapped in {matches}). + --help, -h Show this help. + +Output (JSON, all cases): + { "matches": [ {name, kind, file_path, line_start, line_end, signature, ...}, ... ], + "disambiguation"?: { "n": , "by_kind": {...}, "files": [...], "hint": "..." } } + +Examples: + codemap show runQueryCmd + codemap show foo --kind function + codemap show foo --in src/cli + codemap show runQueryCmd --json +`); +} + +/** + * Parse `argv` after the bootstrap split: `rest[0]` must be `"show"`.
+ */ +export function parseShowRest(rest: string[]): + | { kind: "help" } + | { kind: "error"; message: string } + | { + kind: "run"; + name: string; + kindFilter: string | undefined; + inPath: string | undefined; + json: boolean; + } { + if (rest[0] !== "show") { + throw new Error("parseShowRest: expected show"); + } + + let json = false; + let name: string | undefined; + let kindFilter: string | undefined; + let inPath: string | undefined; + + for (let i = 1; i < rest.length; i++) { + const a = rest[i]!; + if (a === "--help" || a === "-h") return { kind: "help" }; + if (a === "--json") { + json = true; + continue; + } + if (a === "--kind") { + const next = rest[i + 1]; + if (next === undefined || next.startsWith("-")) { + return { + kind: "error", + message: `codemap show: "--kind" requires a value.`, + }; + } + kindFilter = next; + i++; + continue; + } + if (a === "--in") { + const next = rest[i + 1]; + if (next === undefined || next.startsWith("-")) { + return { + kind: "error", + message: `codemap show: "--in" requires a value.`, + }; + } + inPath = next; + i++; + continue; + } + if (a.startsWith("-")) { + return { + kind: "error", + message: `codemap show: unknown option "${a}". Run \`codemap show --help\` for usage.`, + }; + } + if (name !== undefined) { + return { + kind: "error", + message: `codemap show: unexpected extra argument "${a}". Pass exactly one symbol name.`, + }; + } + name = a; + } + + if (name === undefined) { + return { + kind: "error", + message: `codemap show: missing . Run \`codemap show --help\` for usage.`, + }; + } + + return { kind: "run", name, kindFilter, inPath, json }; +} + +/** + * Build the `ShowResult` envelope from a list of matches. Single-match + * → `{matches}` only. Multi-match → adds a `disambiguation` block with + * structured aids so agents narrow without scanning every row. 
+ */ +export function buildShowResult(matches: SymbolMatch[]): ShowResult { + if (matches.length <= 1) return { matches }; + const byKind: Record<string, number> = {}; + for (const m of matches) byKind[m.kind] = (byKind[m.kind] ?? 0) + 1; + const files = Array.from(new Set(matches.map((m) => m.file_path))).sort(); + return { + matches, + disambiguation: { + n: matches.length, + by_kind: byKind, + files, + hint: "Multiple matches. Narrow with --kind or --in .", + }, + }; +} + +/** + * Run `codemap show `. Bootstraps codemap, opens db, looks up, + * renders. Sets `process.exitCode` (no `process.exit`) so piped stdout + * isn't truncated. Errors emit the `{"error":"…"}` envelope on stdout + * under `--json`, plain message on stderr otherwise. + */ +export async function runShowCmd(opts: ShowOpts): Promise<void> { + try { + const user = await loadUserConfig(opts.root, opts.configFile); + initCodemap(resolveCodemapConfig(opts.root, user)); + configureResolver(getProjectRoot(), getTsconfigPath()); + + const projectRoot = getProjectRoot(); + const inPath = + opts.inPath !== undefined + ? toProjectRelative(projectRoot, opts.inPath) + : undefined; + + const db = openDb(); + let matches: SymbolMatch[]; + try { + matches = findSymbolsByName(db, { + name: opts.name, + kind: opts.kind, + inPath, + }); + } finally { + closeDb(db, { readonly: true }); + } + + if (matches.length === 0) { + const filterDesc = describeFilter(opts.kind, inPath); + // SQLite single-quote escape (`''`) — keeps the suggested SQL valid + // when name contains apostrophes (e.g. `O'Brien`). + const safeName = opts.name.replace(/'/g, "''"); + const message = `codemap show: no symbol named "${opts.name}"${filterDesc}.
Try \`codemap query --json "SELECT name, file_path FROM symbols WHERE name LIKE '%${safeName}%'"\` for fuzzy lookup.`; + emitErrorMaybeJson(message, opts.json); + return; + } + + const result = buildShowResult(matches); + if (opts.json) { + console.log(JSON.stringify(result)); + return; + } + renderTerminal(result); + } catch (err) { + const msg = err instanceof Error ? err.message : String(err); + emitErrorMaybeJson(msg, opts.json); + } +} + +function describeFilter( + kind: string | undefined, + inPath: string | undefined, +): string { + const parts: string[] = []; + if (kind !== undefined) parts.push(`kind = "${kind}"`); + if (inPath !== undefined) parts.push(`in = "${inPath}"`); + return parts.length === 0 ? "" : ` (filters: ${parts.join(", ")})`; +} + +function renderTerminal(result: ShowResult): void { + for (let i = 0; i < result.matches.length; i++) { + const m = result.matches[i]!; + if (i > 0) console.log(""); + console.log(`${m.file_path}:${m.line_start}-${m.line_end}`); + console.log(` ${m.signature}`); + } + if (result.disambiguation !== undefined) { + console.error( + `\n# ${result.disambiguation.n} matches — ${result.disambiguation.hint}`, + ); + } +} + +function emitErrorMaybeJson(message: string, json: boolean): void { + if (json) { + console.log(JSON.stringify({ error: message })); + } else { + console.error(message); + } + process.exitCode = 1; +} diff --git a/src/cli/cmd-snippet.test.ts b/src/cli/cmd-snippet.test.ts new file mode 100644 index 0000000..48e7674 --- /dev/null +++ b/src/cli/cmd-snippet.test.ts @@ -0,0 +1,163 @@ +import { afterEach, beforeEach, describe, expect, it } from "bun:test"; +import { mkdirSync, mkdtempSync, rmSync, writeFileSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; + +import { createTables } from "../db"; +import type { CodemapDatabase } from "../db"; +import { hashContent } from "../hash"; +import { openCodemapDatabase } from "../sqlite-db"; +import { buildSnippetResult, 
parseSnippetRest } from "./cmd-snippet"; + +describe("parseSnippetRest", () => { + it("returns help on --help / -h", () => { + expect(parseSnippetRest(["snippet", "--help"]).kind).toBe("help"); + expect(parseSnippetRest(["snippet", "-h"]).kind).toBe("help"); + }); + + it("errors when no given", () => { + const r = parseSnippetRest(["snippet"]); + expect(r.kind).toBe("error"); + if (r.kind === "error") expect(r.message).toContain("missing "); + }); + + it("errors on extra positional argument", () => { + const r = parseSnippetRest(["snippet", "foo", "bar"]); + expect(r.kind).toBe("error"); + if (r.kind === "error") expect(r.message).toContain("unexpected extra"); + }); + + it("errors on unknown flag", () => { + const r = parseSnippetRest(["snippet", "foo", "--with-context"]); + expect(r.kind).toBe("error"); + if (r.kind === "error") expect(r.message).toContain("--with-context"); + }); + + it("parses bare name", () => { + const r = parseSnippetRest(["snippet", "foo"]); + expect(r).toEqual({ + kind: "run", + name: "foo", + kindFilter: undefined, + inPath: undefined, + json: false, + }); + }); + + it("parses name + flags in any order", () => { + const r = parseSnippetRest([ + "snippet", + "--json", + "--kind", + "function", + "foo", + "--in", + "src/cli", + ]); + expect(r).toEqual({ + kind: "run", + name: "foo", + kindFilter: "function", + inPath: "src/cli", + json: true, + }); + }); + + it("throws if rest[0] is not 'snippet'", () => { + expect(() => parseSnippetRest(["query"])).toThrow(); + }); +}); + +describe("buildSnippetResult — source enrichment + envelope", () => { + let projectRoot: string; + let db: CodemapDatabase; + + beforeEach(() => { + projectRoot = mkdtempSync(join(tmpdir(), "snippet-test-")); + mkdirSync(join(projectRoot, "src"), { recursive: true }); + db = openCodemapDatabase(":memory:"); + createTables(db); + }); + + afterEach(() => { + rmSync(projectRoot, { recursive: true, force: true }); + db.close(); + }); + + function seed( + file: string, + 
content: string, + name: string, + lineRange: [number, number], + ) { + writeFileSync(join(projectRoot, file), content); + db.run( + "INSERT INTO files (path, content_hash, size, line_count, language, last_modified, indexed_at) VALUES (?, ?, ?, ?, ?, ?, ?)", + [ + file, + hashContent(content), + content.length, + content.split("\n").length, + "ts", + 1, + 1, + ], + ); + db.run( + `INSERT INTO symbols (file_path, name, kind, line_start, line_end, signature, is_exported, is_default_export) + VALUES (?, ?, 'function', ?, ?, ?, 1, 0)`, + [file, name, lineRange[0], lineRange[1], `function ${name}(): void`], + ); + } + + it("single match returns {matches} with source filled, no disambiguation", () => { + seed("src/a.ts", "line 1\nline 2\nline 3\nline 4\n", "foo", [2, 3]); + const matches = db + .query("SELECT * FROM symbols WHERE name = ?") + .all("foo") as never; + const r = buildSnippetResult({ db, matches, projectRoot }); + expect(r.matches).toHaveLength(1); + expect(r.matches[0]!.source).toBe("line 2\nline 3"); + expect(r.matches[0]!.stale).toBe(false); + expect(r.matches[0]!.missing).toBe(false); + expect(r.disambiguation).toBeUndefined(); + }); + + it("flags stale: true when on-disk content drifts from indexed hash", () => { + seed("src/b.ts", "old\nold line 2\n", "bar", [1, 2]); + // Mutate the file after indexing. 
+ writeFileSync(join(projectRoot, "src/b.ts"), "new\ntotally different\n"); + const matches = db + .query("SELECT * FROM symbols WHERE name = ?") + .all("bar") as never; + const r = buildSnippetResult({ db, matches, projectRoot }); + expect(r.matches[0]!.stale).toBe(true); + expect(r.matches[0]!.source).toBe("new\ntotally different"); + }); + + it("flags missing: true when file no longer exists on disk", () => { + seed("src/c.ts", "x\n", "baz", [1, 1]); + rmSync(join(projectRoot, "src/c.ts")); + const matches = db + .query("SELECT * FROM symbols WHERE name = ?") + .all("baz") as never; + const r = buildSnippetResult({ db, matches, projectRoot }); + expect(r.matches[0]!.missing).toBe(true); + expect(r.matches[0]!.source).toBeUndefined(); + }); + + it("multi-match adds disambiguation envelope", () => { + seed("src/a.ts", "ok\n", "shared", [1, 1]); + seed("src/b.ts", "ok\n", "shared", [1, 1]); + const matches = db + .query("SELECT * FROM symbols WHERE name = ? ORDER BY file_path") + .all("shared") as never; + const r = buildSnippetResult({ db, matches, projectRoot }); + expect(r.matches).toHaveLength(2); + expect(r.disambiguation).toMatchObject({ + n: 2, + by_kind: { function: 2 }, + files: ["src/a.ts", "src/b.ts"], + }); + }); +}); diff --git a/src/cli/cmd-snippet.ts b/src/cli/cmd-snippet.ts new file mode 100644 index 0000000..3d3ec6b --- /dev/null +++ b/src/cli/cmd-snippet.ts @@ -0,0 +1,308 @@ +import { + findSymbolsByName, + getIndexedContentHash, + readSymbolSource, +} from "../application/show-engine"; +import type { SymbolMatch } from "../application/show-engine"; +import { loadUserConfig, resolveCodemapConfig } from "../config"; +import { closeDb, openDb } from "../db"; +import type { CodemapDatabase } from "../db"; +import { configureResolver } from "../resolver"; +import { getProjectRoot, getTsconfigPath, initCodemap } from "../runtime"; +import { toProjectRelative } from "./cmd-validate"; + +/** + * Per-match payload returned by `snippet` — extends the 
`show` row shape + * with the source text and stale-flag fields. Same row shape as + * `findSymbolsByName` returns plus three additive fields: + * `source` (the file lines from line_start..line_end), + * `stale` (true when the file's content_hash drifted since indexing), + * `missing` (true when the file no longer exists on disk). + */ +export interface SnippetMatch extends SymbolMatch { + source: string | undefined; + stale: boolean; + missing: boolean; +} + +/** + * The catalog envelope returned by `snippet` — same shape as `show`'s + * `ShowResult` (per Q-2 + Q-5: snippet adds source/stale/missing on each + * row but keeps the {matches, disambiguation?} envelope). Single match + * → `{matches: [{...}]}`; multi-match adds the structured disambiguation + * block. + */ +export interface SnippetResult { + matches: SnippetMatch[]; + disambiguation?: { + n: number; + by_kind: Record<string, number>; + files: string[]; + hint: string; + }; +} + +interface SnippetOpts { + root: string; + configFile: string | undefined; + name: string; + kind: string | undefined; + inPath: string | undefined; + json: boolean; +} + +/** + * Print `codemap snippet` usage. + */ +export function printSnippetCmdHelp(): void { + console.log(`Usage: codemap snippet [--kind ] [--in ] [--json] + +Look up symbol(s) by exact name and return the source text from disk +(plus the same metadata \`codemap show\` returns). Same lookup semantics +as \`show\`; the difference is that the response carries the actual code body +sliced from disk at line_start..line_end. + +Args: + Exact symbol name (case-sensitive). + +Flags: + --kind Filter by symbols.kind (function / class / const / …). + --in Filter by file scope. A trailing slash or no extension + in the trailing segment is treated as a prefix; otherwise + an exact file match. + --json Emit the JSON envelope (always wrapped in {matches}). + --help, -h Show this help. 
+ +Output (JSON, all cases): + { "matches": [ {name, kind, file_path, line_start, line_end, signature, + source, stale, missing, ...}, ... ], + "disambiguation"?: { "n": , "by_kind": {...}, "files": [...], "hint": "..." } } + +Stale-file behavior: if the file's content hash drifted since the last +index run, the row carries \`stale: true\` and the source is still +returned (read from disk). If the file is missing on disk, the row +carries \`missing: true\` and source is omitted. The agent decides whether +to act on stale content or re-index first. + +Examples: + codemap snippet runQueryCmd + codemap snippet foo --kind function + codemap snippet runQueryCmd --json +`); +} + +/** + * Parse `argv` after the bootstrap split: `rest[0]` must be `"snippet"`. + * Same shape as `parseShowRest` — same flag set + same error UX. + */ +export function parseSnippetRest(rest: string[]): + | { kind: "help" } + | { kind: "error"; message: string } + | { + kind: "run"; + name: string; + kindFilter: string | undefined; + inPath: string | undefined; + json: boolean; + } { + if (rest[0] !== "snippet") { + throw new Error("parseSnippetRest: expected snippet"); + } + + let json = false; + let name: string | undefined; + let kindFilter: string | undefined; + let inPath: string | undefined; + + for (let i = 1; i < rest.length; i++) { + const a = rest[i]!; + if (a === "--help" || a === "-h") return { kind: "help" }; + if (a === "--json") { + json = true; + continue; + } + if (a === "--kind") { + const next = rest[i + 1]; + if (next === undefined || next.startsWith("-")) { + return { + kind: "error", + message: `codemap snippet: "--kind" requires a value.`, + }; + } + kindFilter = next; + i++; + continue; + } + if (a === "--in") { + const next = rest[i + 1]; + if (next === undefined || next.startsWith("-")) { + return { + kind: "error", + message: `codemap snippet: "--in" requires a value.`, + }; + } + inPath = next; + i++; + continue; + } + if (a.startsWith("-")) { + return { + kind: "error", 
+ message: `codemap snippet: unknown option "${a}". Run \`codemap snippet --help\` for usage.`, + }; + } + if (name !== undefined) { + return { + kind: "error", + message: `codemap snippet: unexpected extra argument "${a}". Pass exactly one symbol name.`, + }; + } + name = a; + } + + if (name === undefined) { + return { + kind: "error", + message: `codemap snippet: missing . Run \`codemap snippet --help\` for usage.`, + }; + } + + return { kind: "run", name, kindFilter, inPath, json }; +} + +/** + * Build the `SnippetResult` envelope from matches + per-match source reads. + * Mirrors `buildShowResult` from `cmd-show.ts` but enriches each match with + * `source` / `stale` / `missing` fields read fresh from disk per Q-6 + * (read + flag, no auto-reindex). + */ +export function buildSnippetResult(opts: { + db: CodemapDatabase; + matches: SymbolMatch[]; + projectRoot: string; +}): SnippetResult { + const enriched: SnippetMatch[] = opts.matches.map((m) => { + const indexedHash = getIndexedContentHash(opts.db, m.file_path); + const read = readSymbolSource({ + match: m, + projectRoot: opts.projectRoot, + indexedContentHash: indexedHash, + }); + return { + ...m, + source: read.source, + stale: read.stale, + missing: read.missing, + }; + }); + + if (enriched.length <= 1) return { matches: enriched }; + const byKind: Record<string, number> = {}; + for (const m of enriched) byKind[m.kind] = (byKind[m.kind] ?? 0) + 1; + const files = Array.from(new Set(enriched.map((m) => m.file_path))).sort(); + return { + matches: enriched, + disambiguation: { + n: enriched.length, + by_kind: byKind, + files, + hint: "Multiple matches. Narrow with --kind or --in .", + }, + }; +} + +/** + * Run `codemap snippet `. Mirrors `runShowCmd`'s shape — bootstrap, + * lookup, render. JSON mode prints the envelope verbatim; terminal mode + * prints `path:line-line` + source per row, with a stderr + * staleness hint when any row is stale. 
+ */ +export async function runSnippetCmd(opts: SnippetOpts): Promise<void> { + try { + const user = await loadUserConfig(opts.root, opts.configFile); + initCodemap(resolveCodemapConfig(opts.root, user)); + configureResolver(getProjectRoot(), getTsconfigPath()); + + const projectRoot = getProjectRoot(); + const inPath = + opts.inPath !== undefined + ? toProjectRelative(projectRoot, opts.inPath) + : undefined; + + const db = openDb(); + let matches: SymbolMatch[]; + let result: SnippetResult; + try { + matches = findSymbolsByName(db, { + name: opts.name, + kind: opts.kind, + inPath, + }); + if (matches.length === 0) { + const filterDesc = describeFilter(opts.kind, inPath); + // SQLite single-quote escape (`''`) — keeps the suggested SQL valid + // when name contains apostrophes (e.g. `O'Brien`). + const safeName = opts.name.replace(/'/g, "''"); + const message = `codemap snippet: no symbol named "${opts.name}"${filterDesc}. Try \`codemap query --json "SELECT name, file_path FROM symbols WHERE name LIKE '%${safeName}%'"\` for fuzzy lookup.`; + emitErrorMaybeJson(message, opts.json); + return; + } + result = buildSnippetResult({ db, matches, projectRoot }); + } finally { + closeDb(db, { readonly: true }); + } + + if (opts.json) { + console.log(JSON.stringify(result)); + return; + } + renderTerminal(result); + } catch (err) { + const msg = err instanceof Error ? err.message : String(err); + emitErrorMaybeJson(msg, opts.json); + } +} + +function describeFilter( + kind: string | undefined, + inPath: string | undefined, +): string { + const parts: string[] = []; + if (kind !== undefined) parts.push(`kind = "${kind}"`); + if (inPath !== undefined) parts.push(`in = "${inPath}"`); + return parts.length === 0 ? "" : ` (filters: ${parts.join(", ")})`; +} + +function renderTerminal(result: SnippetResult): void { + let anyStale = false; + for (let i = 0; i < result.matches.length; i++) { + const m = result.matches[i]!; + if (i > 0) console.log(""); + const stalePrefix = m.stale ? 
" [STALE]" : ""; + const missingPrefix = m.missing ? " [MISSING]" : ""; + console.log( + `${m.file_path}:${m.line_start}-${m.line_end}${stalePrefix}${missingPrefix}`, + ); + if (m.source !== undefined) console.log(m.source); + if (m.stale) anyStale = true; + } + if (result.disambiguation !== undefined) { + console.error( + `\n# ${result.disambiguation.n} matches — ${result.disambiguation.hint}`, + ); + } + if (anyStale) { + console.error( + `\n# Some snippets are stale (file changed since last index). Run \`codemap\` or \`codemap --files \` to refresh.`, + ); + } +} + +function emitErrorMaybeJson(message: string, json: boolean): void { + if (json) { + console.log(JSON.stringify({ error: message })); + } else { + console.error(message); + } + process.exitCode = 1; +} diff --git a/src/cli/cmd-validate.ts b/src/cli/cmd-validate.ts index 4a87a03..c069afb 100644 --- a/src/cli/cmd-validate.ts +++ b/src/cli/cmd-validate.ts @@ -146,7 +146,7 @@ export function computeValidateRows( * slashes (tinyglobby / Bun.Glob / git diff all emit POSIX), so we normalize * here to make `indexByPath.get(rel)` succeed cross-platform. */ -function toProjectRelative(projectRoot: string, p: string): string { +export function toProjectRelative(projectRoot: string, p: string): string { const rel = isAbsolute(p) ? relative(projectRoot, p) : p; return sep === "/" ? rel : rel.split(sep).join("/"); } diff --git a/src/cli/main.ts b/src/cli/main.ts index 82f832b..c47d473 100644 --- a/src/cli/main.ts +++ b/src/cli/main.ts @@ -108,6 +108,52 @@ Copies bundled agent templates into .agents/ under the project root. 
return; } + if (rest[0] === "show") { + const { parseShowRest, printShowCmdHelp, runShowCmd } = + await import("./cmd-show.js"); + const parsed = parseShowRest(rest); + if (parsed.kind === "help") { + printShowCmdHelp(); + return; + } + if (parsed.kind === "error") { + console.error(parsed.message); + process.exit(1); + } + await runShowCmd({ + root, + configFile, + name: parsed.name, + kind: parsed.kindFilter, + inPath: parsed.inPath, + json: parsed.json, + }); + return; + } + + if (rest[0] === "snippet") { + const { parseSnippetRest, printSnippetCmdHelp, runSnippetCmd } = + await import("./cmd-snippet.js"); + const parsed = parseSnippetRest(rest); + if (parsed.kind === "help") { + printSnippetCmdHelp(); + return; + } + if (parsed.kind === "error") { + console.error(parsed.message); + process.exit(1); + } + await runSnippetCmd({ + root, + configFile, + name: parsed.name, + kind: parsed.kindFilter, + inPath: parsed.inPath, + json: parsed.json, + }); + return; + } + if (rest[0] === "mcp") { const { parseMcpRest, printMcpCmdHelp, runMcpCmd } = await import("./cmd-mcp.js"); diff --git a/templates/agents/rules/codemap.md b/templates/agents/rules/codemap.md index 9d5733e..0d62cc0 100644 --- a/templates/agents/rules/codemap.md +++ b/templates/agents/rules/codemap.md @@ -33,6 +33,8 @@ Install **[@stainless-code/codemap](https://www.npmjs.com/package/@stainless-cod | List / drop baselines | `codemap query --baselines` · `codemap query --drop-baseline ` | | Per-delta audit | `codemap audit --json --baseline base` (auto-resolves `base-files` / `base-dependencies` / `base-deprecated`) | | MCP server (for agent hosts) | `codemap mcp` — JSON-RPC on stdio; one tool per CLI verb. See **MCP** section below. 
| +| Targeted read (metadata) | `codemap show [--kind ] [--in ] [--json]` — file:line + signature | +| Targeted read (source text) | `codemap snippet [--kind ] [--in ] [--json]` — same lookup + source from disk + stale flag | **Recipe `actions`:** with **`--json`**, recipes that define an `actions` template append it to every row (kebab-case verb + description — e.g. `fan-out` → `review-coupling`). Under `--baseline`, actions attach to the **`added`** rows only. Inspect via **`--recipes-json`**. Ad-hoc SQL never carries actions. @@ -55,9 +57,11 @@ Validation: SQL is rejected at load time if it starts with DML/DDL (DELETE/DROP/ **Audit (`codemap audit`)**: structural-drift command; emits `{head, deltas: {files, dependencies, deprecated}}` (each delta carries its own `base` metadata). Reuses B.6 baselines as the snapshot source. Two CLI shapes — `--baseline ` auto-resolves `-files` / `-dependencies` / `-deprecated`; `---baseline ` is the explicit per-delta override. v1 ships no `verdict` / threshold config — consumers compose `--json` + `jq` for CI exit codes. Auto-runs an incremental index before the diff (use `--no-index` to skip for frozen-DB CI). +**Targeted reads (`show` / `snippet`)**: precise lookup by exact symbol name without composing SQL. `show` returns metadata (`file_path:line_start-line_end` + `signature`); `snippet` returns the source text from disk plus `stale` / `missing` flags. Both share the same flag set (`--kind ` to filter by `symbols.kind`, `--in ` for file-scope filter — directory prefix or exact file). Output envelope is `{matches, disambiguation?}` — single match → `{matches: [{...}]}`; multi-match adds `disambiguation: {n, by_kind, files, hint}` so agents narrow without re-scanning. Name match is exact / case-sensitive — for fuzzy use `query` with `LIKE '%name%'`. 
Snippet stale-file behavior: `source` is always returned when the file exists; `stale: true` means the line range may have shifted (re-index with `codemap` or `codemap --files ` before acting on the source). + **MCP server (`codemap mcp`)**: stdio MCP (Model Context Protocol) server — agents call codemap as JSON-RPC tools instead of shelling out to the CLI on every read. v1 ships one tool per CLI verb plus four lazy-cached resources: -- **Tools:** `query` / `query_batch` / `query_recipe` / `audit` / `save_baseline` / `list_baselines` / `drop_baseline` / `context` / `validate`. Snake_case keys (Codemap convention matching MCP spec examples + reference servers — spec is convention-agnostic; CLI stays kebab). +- **Tools:** `query` / `query_batch` / `query_recipe` / `audit` / `save_baseline` / `list_baselines` / `drop_baseline` / `context` / `validate` / `show` / `snippet`. Snake_case keys (Codemap convention matching MCP spec examples + reference servers — spec is convention-agnostic; CLI stays kebab). - **`query_batch` (MCP-only):** N statements in one round-trip. Items are `string | {sql, summary?, changed_since?, group_by?}` — string form inherits batch-wide flag defaults, object form overrides on a per-key basis. Per-statement errors are isolated. - **`save_baseline` (polymorphic):** one tool, `{name, sql? | recipe?}` with runtime exclusivity check (mirrors the CLI's single `--save-baseline=` verb). - **Resources:** `codemap://recipes` (catalog), `codemap://recipes/{id}` (one recipe), `codemap://schema` (live DDL from `sqlite_schema`), `codemap://skill` (bundled SKILL.md text). Lazy-cached on first `read_resource`. diff --git a/templates/agents/skills/codemap/SKILL.md b/templates/agents/skills/codemap/SKILL.md index 60de587..f021a73 100644 --- a/templates/agents/skills/codemap/SKILL.md +++ b/templates/agents/skills/codemap/SKILL.md @@ -67,6 +67,8 @@ Each emitted delta carries its own `base` metadata so mixed-baseline audits are - **`drop_baseline`** — `{name}`. 
Returns `{dropped: }` on success or `isError` if the name doesn't exist. - **`context`** — `{compact?, intent?}`. Returns the project-bootstrap envelope (codemap version, schema version, file count, language breakdown, hubs, sample markers). Designed for agent session-start — one call replaces 4-5 `query` calls. - **`validate`** — `{paths?: string[]}`. Compares on-disk SHA-256 to indexed `files.content_hash`; empty `paths` validates everything. Returns rows with status (`ok`/`stale`/`missing`/`unindexed`). +- **`show`** — `{name, kind?, in?}`. Exact, case-sensitive symbol name lookup. Returns `{matches: [{name, kind, file_path, line_start, line_end, signature, ...}], disambiguation?: {n, by_kind, files, hint}}`. Single match → `{matches: [{...}]}`; multi-match adds the disambiguation envelope so you narrow without re-scanning. Fuzzy lookup belongs in `query` with `LIKE`. +- **`snippet`** — `{name, kind?, in?}`. Same lookup as `show` but each match also carries `source` (file lines from disk at `line_start..line_end`), `stale` (true when content_hash drifted since indexing — line range may have shifted), `missing` (true when file is gone). `source` is always returned when the file exists; agent decides whether to act on stale content or run `codemap` / `codemap --files ` to re-index first. No auto-reindex side-effects from this read tool. **Resources (lazy-cached on first `read_resource`; constant for server-process lifetime):**