diff --git a/.agents/rules/codemap.md b/.agents/rules/codemap.md index fb1d42a..ff4ba53 100644 --- a/.agents/rules/codemap.md +++ b/.agents/rules/codemap.md @@ -96,30 +96,32 @@ Violating this order is wrong even if you get the right answer — it wastes tim If the question looks like any of these → use the index: -| Question shape | Table(s) | -| ------------------------------------------------------------- | -------------------------------------------------------- | -| "What/which files import X?" | `imports` (by `source`) or `dependencies` (by `to_path`) | -| "Where is X defined?" | `symbols` | -| "What does file X export?" | `exports` | -| "What hooks does component X use?" / "List React components" | `components` | -| "What are the CSS variables/tokens for X?" | `css_variables` | -| "Find all TODOs/FIXMEs" | `markers` | -| "Who depends on file X?" / "What does file X depend on?" | `dependencies` | -| "How many files/symbols/components are there?" | any table with `COUNT(*)` | -| "What are the CSS classes in X?" | `css_classes` | -| "What keyframe animations exist?" | `css_keyframes` | -| "What fields does interface/type X have?" | `type_members` | -| "Is symbol X deprecated?" / "What does X do?" | `symbols` (`doc_comment`) | -| "What's `@internal` / `@beta` / `@alpha` / `@private`?" | `symbols.visibility` (parsed JSDoc tag — not regex) | -| "Who calls X?" / "What does X call?" | `calls` | -| "Is symbol X tested?" / "What's the coverage of file Y?" | `coverage` (after `ingest-coverage`) | -| "What's structurally dead AND untested?" | `--recipe untested-and-dead` | -| "Rank files by test coverage" | `--recipe files-by-coverage` | -| "Worst-covered exported functions" | `--recipe worst-covered-exports` | -| "Which components touch deprecated APIs?" | `--recipe components-touching-deprecated` | -| "What's risky to refactor right now?" | `--recipe refactor-risk-ranking` | -| "Which exports has nobody imported?" 
| `--recipe unimported-exports` | -| "Find @deprecated functions with TODO/FIXME and low coverage" | `--recipe text-in-deprecated-functions` (needs FTS5 on) | +| Question shape | Table(s) | +| ------------------------------------------------------------- | --------------------------------------------------------- | +| "What/which files import X?" | `imports` (by `source`) or `dependencies` (by `to_path`) | +| "Where is X defined?" | `symbols` | +| "What does file X export?" | `exports` | +| "What hooks does component X use?" / "List React components" | `components` | +| "What are the CSS variables/tokens for X?" | `css_variables` | +| "Find all TODOs/FIXMEs" | `markers` | +| "Who depends on file X?" / "What does file X depend on?" | `dependencies` | +| "How many files/symbols/components are there?" | any table with `COUNT(*)` | +| "What are the CSS classes in X?" | `css_classes` | +| "What keyframe animations exist?" | `css_keyframes` | +| "What fields does interface/type X have?" | `type_members` | +| "Is symbol X deprecated?" / "What does X do?" | `symbols` (`doc_comment`) | +| "What's `@internal` / `@beta` / `@alpha` / `@private`?" | `symbols.visibility` (parsed JSDoc tag — not regex) | +| "Who calls X?" / "What does X call?" | `calls` | +| "Is symbol X tested?" / "What's the coverage of file Y?" | `coverage` (after `ingest-coverage`) | +| "What's structurally dead AND untested?" | `--recipe untested-and-dead` | +| "Rank files by test coverage" | `--recipe files-by-coverage` | +| "Worst-covered exported functions" | `--recipe worst-covered-exports` | +| "Which components touch deprecated APIs?" | `--recipe components-touching-deprecated` | +| "What's risky to refactor right now?" | `--recipe refactor-risk-ranking` | +| "Which exports has nobody imported?" | `--recipe unimported-exports` | +| "Find @deprecated functions with TODO/FIXME and low coverage" | `--recipe text-in-deprecated-functions` (needs FTS5 on) | +| "What's high-complexity AND undertested?" 
| `--recipe high-complexity-untested` | +| "What's the cyclomatic complexity of symbol X?" | `SELECT name, complexity FROM symbols WHERE name = '...'` | ## When Grep / Read IS appropriate diff --git a/.agents/skills/codemap/SKILL.md b/.agents/skills/codemap/SKILL.md index 3ac517a..cc9e084 100644 --- a/.agents/skills/codemap/SKILL.md +++ b/.agents/skills/codemap/SKILL.md @@ -34,7 +34,7 @@ After **`bun run build`**, **`node dist/index.mjs query …`** or a linked **`co Replace placeholders (`'...'`) with your module path, file glob, or symbol name. -**CLI shortcuts:** **`bun src/index.ts query --json --recipe `** runs bundled SQL (preferred for agents). **`bun src/index.ts query --recipe `** without **`--json`** prints a table. **`bun src/index.ts query --recipes-json`** prints every bundled recipe (**`id`**, **`description`**, **`sql`**, optional **`actions`**) as JSON (no index / DB required). **`bun src/index.ts query --print-sql `** prints one recipe’s SQL only. Ids include **`fan-out`**, **`fan-out-sample`** (**`GROUP_CONCAT`** samples), **`fan-out-sample-json`** (same, but **`json_group_array`** — needs SQLite JSON1), **`fan-in`**, **`index-summary`**, **`files-largest`**, **`components-by-hooks`**, **`components-touching-deprecated`** (UNION of hook + call paths to `@deprecated` symbols), **`markers-by-kind`**, **`deprecated-symbols`**, **`refactor-risk-ranking`** (per-file `(fan_in + 1) × (100 - avg_coverage_pct)`), **`text-in-deprecated-functions`** (FTS5 ⨯ symbols ⨯ coverage demo — needs `--with-fts` enabled), **`unimported-exports`** (exports with no detectable importer; v1 doesn't follow re-export chains — see recipe `.md` for caveats), **`visibility-tags`**, **`barrel-files`**, **`files-hashes`** — see **`bun src/index.ts query --help`**. +**CLI shortcuts:** **`bun src/index.ts query --json --recipe `** runs bundled SQL (preferred for agents). **`bun src/index.ts query --recipe `** without **`--json`** prints a table. 
**`bun src/index.ts query --recipes-json`** prints every bundled recipe (**`id`**, **`description`**, **`sql`**, optional **`actions`**) as JSON (no index / DB required). **`bun src/index.ts query --print-sql `** prints one recipe’s SQL only. Ids include **`fan-out`**, **`fan-out-sample`** (**`GROUP_CONCAT`** samples), **`fan-out-sample-json`** (same, but **`json_group_array`** — needs SQLite JSON1), **`fan-in`**, **`index-summary`**, **`files-largest`**, **`components-by-hooks`**, **`components-touching-deprecated`** (UNION of hook + call paths to `@deprecated` symbols), **`markers-by-kind`**, **`deprecated-symbols`**, **`refactor-risk-ranking`** (per-file `(fan_in + 1) × (100 - avg_coverage_pct)`), **`high-complexity-untested`** (cyclomatic complexity ≥ 10 + coverage < 50%; per-function), **`text-in-deprecated-functions`** (FTS5 ⨯ symbols ⨯ coverage demo — needs `--with-fts` enabled), **`unimported-exports`** (exports with no detectable importer; v1 doesn't follow re-export chains — see recipe `.md` for caveats), **`visibility-tags`**, **`barrel-files`**, **`files-hashes`**, **`untested-and-dead`** (exported AND uncalled AND uncovered), **`files-by-coverage`** (per-file rollup of statement coverage), **`worst-covered-exports`** (lowest-covered exported symbols) — see **`bun src/index.ts query --help`**. 
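A minimal sketch of the two scoring rules quoted in the recipe list above, written in TypeScript purely for illustration. The bundled recipes are SQL; the names `FileStats`, `FunctionStats`, `refactorRisk`, and `isHighComplexityUntested` are hypothetical and do not come from the codebase. It assumes that a `NULL` complexity never matches the `high-complexity-untested` filter, which is how a SQL `>=` comparison against `NULL` behaves.

```typescript
// Hypothetical per-file inputs for the refactor-risk-ranking formula.
interface FileStats {
  fanIn: number; // number of files that depend on this file
  avgCoveragePct: number; // average coverage, 0..100
}

// (fan_in + 1) x (100 - avg_coverage_pct), as quoted in the recipe description.
function refactorRisk(file: FileStats): number {
  return (file.fanIn + 1) * (100 - file.avgCoveragePct);
}

// Hypothetical per-function inputs for the high-complexity-untested filter.
interface FunctionStats {
  name: string;
  complexity: number | null; // null for non-functions and class methods (v1)
  coveragePct: number; // measured coverage, 0..100
}

// Complexity >= 10 AND coverage < 50%; null complexity is assumed never to match.
function isHighComplexityUntested(fn: FunctionStats): boolean {
  return fn.complexity !== null && fn.complexity >= 10 && fn.coveragePct < 50;
}

const risk = refactorRisk({ fanIn: 4, avgCoveragePct: 80 });
const flagged = isHighComplexityUntested({ name: "parse", complexity: 12, coveragePct: 30 });
console.log(risk, flagged);
```

The `+ 1` in the risk formula keeps files with zero importers from collapsing to a score of zero, so an untested leaf file still ranks above a fully covered one.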
**Output flags** (compose with **`--recipe`** or ad-hoc SQL): @@ -132,22 +132,23 @@ LIMIT 10 ### `symbols` — Functions, types, interfaces, enums, constants, classes -| Column | Type | Description | -| ----------------- | ---------- | ---------------------------------------------------------------------------------------------------------------------------------- | -| id | INTEGER PK | Auto-increment ID | -| file_path | TEXT FK | References `files(path)` | -| name | TEXT | Symbol name | -| kind | TEXT | `function`, `class`, `type`, `interface`, `enum`, `const` | -| line_start | INTEGER | Start line (1-based) | -| line_end | INTEGER | End line (1-based) | -| signature | TEXT | Reconstructed signature with generics and return types | -| is_exported | INTEGER | 1 if exported | -| is_default_export | INTEGER | 1 if default export | -| members | TEXT | JSON enum members (NULL for non-enums) | -| doc_comment | TEXT | Leading JSDoc text (cleaned), NULL when absent | -| value | TEXT | Literal value for consts (`"ok"`, `42`, `true`, `null`) | -| parent_name | TEXT | Enclosing symbol name (class/function), NULL = top-level | -| visibility | TEXT | Line-leading JSDoc tag: `public` / `private` / `internal` / `alpha` / `beta`; NULL when absent. 
First match in document order wins |
+| Column | Type | Description |
+| ----------------- | ---------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| id | INTEGER PK | Auto-increment ID |
+| file_path | TEXT FK | References `files(path)` |
+| name | TEXT | Symbol name |
+| kind | TEXT | `function`, `class`, `type`, `interface`, `enum`, `const` |
+| line_start | INTEGER | Start line (1-based) |
+| line_end | INTEGER | End line (1-based) |
+| signature | TEXT | Reconstructed signature with generics and return types |
+| is_exported | INTEGER | 1 if exported |
+| is_default_export | INTEGER | 1 if default export |
+| members | TEXT | JSON enum members (NULL for non-enums) |
+| doc_comment | TEXT | Leading JSDoc text (cleaned), NULL when absent |
+| value | TEXT | Literal value for consts (`"ok"`, `42`, `true`, `null`) |
+| parent_name | TEXT | Enclosing symbol name (class/function), NULL = top-level |
+| visibility | TEXT | Line-leading JSDoc tag: `public` / `private` / `internal` / `alpha` / `beta`; NULL when absent. First match in document order wins |
+| complexity | REAL | Cyclomatic complexity (`1 + decision points`) for function-shaped symbols. NULL for non-functions and class methods (v1). Powers `--recipe high-complexity-untested`. Decision points: `if`, `while`, `do…while`, `for`/`for-in`/`for-of`, `case X:` (not `default:`), `&&`/`\|\|`/`??`/`?:`, `catch` |

 ### `calls` — Function-scoped call edges (deduped per file)
diff --git a/.changeset/cyclomatic-complexity.md b/.changeset/cyclomatic-complexity.md
new file mode 100644
index 0000000..5b63b9f
--- /dev/null
+++ b/.changeset/cyclomatic-complexity.md
@@ -0,0 +1,27 @@
+---
+"@stainless-code/codemap": patch
+---
+
+feat(complexity): cyclomatic complexity column on `symbols` + bundled recipe (research note § 1.4 ship-pick (c))
+
+Adds per-function cyclomatic complexity computed during AST walking. Schema bump `SCHEMA_VERSION` 7 → 8 — first reindex after upgrade triggers a full rebuild via the existing version-mismatch path.
+
+**What lands:**
+
+- New `complexity REAL` column on `symbols`. Computed via McCabe formula (`1 + decision points`) for function-shaped symbols (top-level `function` declarations + arrow-function consts). `NULL` for non-functions (interfaces, types, enums, plain consts) and class methods (v1 limitation; documented in the recipe `.md`).
+- Decision points counted: `if`, `while`, `do…while`, `for`, `for…in`, `for…of`, `case X:` arms (not `default:` fall-through), `&&` / `||` / `??` short-circuit operators, `?:` ternary, `catch` clauses.
+- New bundled recipe `high-complexity-untested` — function-shaped symbols with complexity ≥ 10 AND measured coverage < 50%. Combines structural + runtime evidence axes; surfaces refactor-priority candidates that single-axis recipes (`untested-and-dead`, `worst-covered-exports`) miss because they're "called but undertested."
+
+**Implementation:**
+
+- Parser visitor (`src/parser.ts`) maintains a `complexityStack` keyed by symbol index. On function entry, pushes counter at 1 + symbol index. Branching-node visitors increment the top counter. On function exit, pops and writes complexity into the symbol row already pushed during entry.
+- Nested function declarations get their own stack entries — inner branches don't count toward the outer function. (Standard McCabe — each function counted independently.) + +**Pre-v1 patch** per `.agents/lessons.md` "changesets bump policy": schema-bumping changes are minor in semver but pre-v1 we default to patch unless the bump forces a `.codemap.db` rebuild. This one does (column added; auto-detected by `createSchema()` mismatch path) — every consumer's first run after upgrade re-indexes from scratch. + +Agent rule + skill lockstep updated per `docs/README.md` Rule 10 — both `templates/agents/` and `.agents/` codemap rule + skill mention the `complexity` column, the new recipe, and the cyclomatic-complexity definition. + +**Out of scope:** + +- **Class method complexity** — `MethodDefinition` visitor currently doesn't push to the complexity stack. Documented in `high-complexity-untested.md` v1 limitation; refactor opportunity for class-heavy projects. +- **Per-class / per-file rollups** — `complexity` is per-symbol; project-local recipes can `SUM` / `AVG` it as needed. diff --git a/README.md b/README.md index 9c35d59..508413b 100644 --- a/README.md +++ b/README.md @@ -110,12 +110,21 @@ codemap audit --base v1.0.0 --files-baseline pre-release-files # mix --base wit # non-git projects get a clean `--base requires a git repository` error. # Recipes that define per-row action templates append "actions" hints (kebab-case verb + # description) in --json output; ad-hoc SQL never carries actions. Inspect via --recipes-json. -# --format — pipe results into GitHub Code Scanning (SARIF -# 2.1.0) or surface findings inline on PRs (GH Actions ::notice file=…,line=…::msg). Both -# require a flat row list (no --summary / --group-by / baseline). Auto-detects file_path / -# path / to_path / from_path; rule.id is codemap. (or codemap.adhoc for ad-hoc). 
+# --format — pipe results into GitHub Code Scanning +# (SARIF 2.1.0), surface findings inline on PRs (GH Actions ::notice file=…,line=…::msg), or +# render edge-shaped recipes as Mermaid `flowchart LR`. All four require a flat row list +# (no --summary / --group-by / baseline). SARIF / annotations auto-detect file_path / +# path / to_path / from_path; rule.id is codemap. (or codemap.adhoc). Mermaid +# requires {from, to, label?, kind?} rows and rejects unbounded inputs (>50 edges) with a +# scope-suggestion error — alias columns via SELECT col AS "from", col2 AS "to". codemap query --recipe deprecated-symbols --format sarif > findings.sarif codemap query --recipe deprecated-symbols --format annotations # one ::notice per row +codemap query --format mermaid 'SELECT from_path AS "from", to_path AS "to" FROM dependencies LIMIT 50' +# --with-fts — opt-in FTS5 virtual table populated at index time. Default OFF (preserves +# .codemap/index.db size); CLI flag wins over codemap.config.ts `fts5` field. Toggle change +# auto-detects and forces a full rebuild so `source_fts` stays consistent. +codemap --with-fts --full +codemap query --recipe text-in-deprecated-functions # demonstrates FTS5 ⨯ symbols ⨯ coverage JOIN # HTTP API — same tool taxonomy as `codemap mcp`, exposed over POST /tool/{name} for # non-MCP consumers (CI scripts, curl, IDE plugins). Loopback default; optional --token. 
TOKEN=$(openssl rand -hex 32) diff --git a/docs/architecture.md b/docs/architecture.md index 4e0a539..a275ad9 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -119,7 +119,7 @@ A local SQLite database (`.codemap/index.db`) indexes the project tree and store **Query wiring:** **`src/cli/cmd-query.ts`** (argv, **`printQueryResult`**, `--recipe` / `-r` alias, **`--summary`**, **`--changed-since`**, **`--group-by`**, **`--save-baseline`** / **`--baseline`** / **`--baselines`** / **`--drop-baseline`**), **`src/application/query-recipes.ts`** (**`QUERY_RECIPES`** — bundled SQL only source; optional **`actions: RecipeAction[]`** per recipe), **`src/cli/main.ts`** (**`--recipes-json`** / **`--print-sql`** exit before config/DB). With **`--json`**, errors use **`{"error":"…"}`** on stdout for SQL failures, DB open, and bootstrap (same shape); **`runQueryCmd`** sets **`process.exitCode`** instead of **`process.exit`**. Friendlier "no `.codemap/index.db`" — `no such table: ` and `no such column: ` errors are rewritten in **`enrichQueryError`** to point at `codemap` / `codemap --full`. **`--summary`** filters output only — the SQL still executes against the index; output collapses to `{"count": N}` (with `--json`) or `count: N`. **`--changed-since `** post-filters result rows by `path` / `file_path` / `from_path` / `to_path` / `resolved_path` against `git diff --name-only ...HEAD ∪ git status --porcelain` (helper: **`src/git-changed.ts`** — `getFilesChangedSince`, `filterRowsByChangedFiles`, `PATH_COLUMNS`); rows with no recognised path column pass through. **`--group-by `** (`owner` | `directory` | `package`) routes through **`runGroupedQuery`** in `cmd-query.ts` and emits `{"group_by": "", "groups": [{key, count, rows}]}` (or `[{key, count}]` with `--summary`); helpers in **`src/group-by.ts`** (`groupRowsBy`, `firstDirectory`, `loadCodeowners`, `discoverWorkspaceRoots`, `makePackageBucketizer`, `codeownersGlobToRegex`). 
CODEOWNERS lookup is last-match-wins (GitHub semantics); workspace discovery reads `package.json` `workspaces` and `pnpm-workspace.yaml` `packages:`. **`--save-baseline[=]`** snapshots the result to the **`query_baselines`** table inside `.codemap/index.db` (no parallel JSON files; survives `--full` / SCHEMA bumps because the table is intentionally absent from `dropAll()`); name defaults to `--recipe` id, ad-hoc SQL needs an explicit name. **`--baseline[=]`** replays the SQL, fetches the saved row set, and emits `{baseline:{...}, current_row_count, added: [...], removed: [...]}` (or `{baseline:{...}, current_row_count, added: N, removed: N}` with `--summary`); identity is per-row multiset equality (canonical `JSON.stringify` keyed frequency map — duplicate rows are tracked, not collapsed). No fuzzy "changed" category in v1. **`--group-by` is mutually exclusive** with both `--save-baseline` and `--baseline` (different output shapes). **`--baselines`** (read-only list) and **`--drop-baseline `** complete the surface; helpers in **`src/db.ts`** (`upsertQueryBaseline`, `getQueryBaseline`, `listQueryBaselines`, `deleteQueryBaseline`). **Per-row recipe `actions`** are appended only when the user runs **`--recipe `** with **`--json`** AND the recipe defines an `actions` template — programmatic `cm.query(sql)` and ad-hoc CLI SQL never carry actions; under `--baseline`, actions attach to `added` rows only (the rows the agent should act on). The **`components-by-hooks`** recipe ranks by hook count with a **comma-based tally** on **`hooks_used`** (no SQLite JSON1). Shipped **`templates/agents/`** documents **`codemap query --json`** as the primary agent example ([README § CLI](../README.md#cli)). 
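The `--baseline` comparison described above (per-row multiset equality over a keyed frequency map) can be sketched as follows. This is an illustrative TypeScript sketch under stated assumptions, not the code in `cmd-query.ts` or `src/db.ts`: the names `Row`, `canonicalKey`, `tallyRows`, and `diffRows` are hypothetical, and sorting object keys before `JSON.stringify` is one assumed way to make the key canonical.

```typescript
type Row = Record<string, unknown>;
type Tally = Record<string, { row: Row; count: number }>;

// Assumed canonical form: sort column names so key order does not affect identity.
function canonicalKey(row: Row): string {
  const entries = Object.keys(row)
    .sort()
    .map((column) => [column, row[column]]);
  return JSON.stringify(entries);
}

// Frequency map keyed by canonical row; duplicates are counted, not collapsed.
function tallyRows(rows: Row[]): Tally {
  const tally: Tally = {};
  for (const row of rows) {
    const key = canonicalKey(row);
    if (tally[key]) tally[key].count++;
    else tally[key] = { row, count: 1 };
  }
  return tally;
}

// Rows whose count grew are "added"; rows whose count shrank are "removed".
function diffRows(baseline: Row[], current: Row[]): { added: Row[]; removed: Row[] } {
  const before = tallyRows(baseline);
  const after = tallyRows(current);
  const added: Row[] = [];
  const removed: Row[] = [];
  for (const key of Object.keys(after)) {
    const extra = after[key].count - (before[key]?.count ?? 0);
    for (let i = 0; i < extra; i++) added.push(after[key].row);
  }
  for (const key of Object.keys(before)) {
    const missing = before[key].count - (after[key]?.count ?? 0);
    for (let i = 0; i < missing; i++) removed.push(before[key].row);
  }
  return { added, removed };
}

const result = diffRows(
  [{ a: 1 }, { a: 1 }, { b: 2 }],
  [{ a: 1 }, { c: 3 }, { c: 3 }],
);
console.log(JSON.stringify(result));
```

Because counts are compared rather than set membership, dropping one of two identical rows shows up as a single `removed` entry, which matches the "duplicate rows are tracked, not collapsed" behaviour described in the paragraph.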
-**Output formatters:** **`src/application/output-formatters.ts`** — pure transport-agnostic; **`formatSarif`** emits SARIF 2.1.0 (auto-detected location columns: `file_path` / `path` / `to_path` / `from_path` priority + optional `line_start` / `line_end` region; `rule.id = codemap.` for `--recipe`, `codemap.adhoc` for ad-hoc SQL; aggregate recipes without locations → `results: []` + stderr warning); **`formatAnnotations`** emits `::notice file=…,line=…::msg` GitHub Actions workflow commands (one line per locatable row; messages collapsed to a single line because the GH parser stops at the first newline). Wired into both **`src/cli/cmd-query.ts`** (`--format `; `--format` overrides `--json`; `sarif` / `annotations` reject `--summary` / `--group-by` / baseline at parse time) and the MCP **`query`** / **`query_recipe`** tools (`format: "sarif" | "annotations"` with the same incompatibility guard). Per-recipe `sarifLevel` / `sarifMessage` / `sarifRuleId` overrides via frontmatter on `.md` deferred to v1.x. +**Output formatters:** **`src/application/output-formatters.ts`** — pure transport-agnostic; **`formatSarif`** emits SARIF 2.1.0 (auto-detected location columns: `file_path` / `path` / `to_path` / `from_path` priority + optional `line_start` / `line_end` region; `rule.id = codemap.` for `--recipe`, `codemap.adhoc` for ad-hoc SQL; aggregate recipes without locations → `results: []` + stderr warning); **`formatAnnotations`** emits `::notice file=…,line=…::msg` GitHub Actions workflow commands (one line per locatable row; messages collapsed to a single line because the GH parser stops at the first newline); **`formatMermaid`** emits a `flowchart LR` from `{from, to, label?, kind?}` rows with a hard `MERMAID_MAX_EDGES = 50` ceiling — unbounded inputs reject with a scope-suggestion error naming the recipe + count + `LIMIT` / `--via` / `WHERE` knobs (auto-truncation deliberately out of scope; would be a verdict masquerading as output mode). 
Wired into both **`src/cli/cmd-query.ts`** (`--format `; `--format` overrides `--json`; `sarif` / `annotations` / `mermaid` reject `--summary` / `--group-by` / baseline at parse time) and the MCP **`query`** / **`query_recipe`** tools (`format: "sarif" | "annotations" | "mermaid"` with the same incompatibility guard). Per-recipe `sarifLevel` / `sarifMessage` / `sarifRuleId` overrides via frontmatter on `.md` deferred to v1.x. **Validate wiring:** **`src/cli/cmd-validate.ts`** (argv + render) + **`src/application/validate-engine.ts`** (engine — **`computeValidateRows`** + **`toProjectRelative`**). `computeValidateRows` is a pure function over `(db, projectRoot, paths)` returning `{path, status}` rows where `status ∈ stale | missing | unindexed`. CLI wraps it with read-once-and-print + exits **1** on any drift (git-status semantics). Path normalization: **`toProjectRelative`** converts CLI input to POSIX-style relative keys matching the `files.path` storage format (Windows backslash → forward slash); same convention as `lint-staged.config.js`. Also reused by `cmd-show.ts` / `cmd-snippet.ts` and the MCP show/snippet handlers — single canonical implementation. @@ -179,7 +179,7 @@ Optional **`/config.{ts,js,json}`** (default `.codemap/config.*`; def **Fresh database:** the default CLI **`codemap`** (incremental) calls **`createSchema()`** in **`runCodemapIndex`** before **`getChangedFiles()`**, so the **`meta`** table exists before **`getMeta(..., "last_indexed_commit")`** runs on an empty **`.codemap/index.db`**. -Current schema version: **7** — see [Schema Versioning](#schema-versioning) for details. +Current schema version: **8** — see [Schema Versioning](#schema-versioning) for details. All tables use `STRICT` mode. Tables marked with `WITHOUT ROWID` store data directly in the primary key B-tree. PRAGMAs and index design: [SQLite Performance Configuration](#sqlite-performance-configuration). @@ -198,7 +198,7 @@ All tables use `STRICT` mode. 
Tables marked with `WITHOUT ROWID` store data dire ### `symbols` — Functions, constants, classes, interfaces, types, enums (`STRICT`) | Column | Type | Description | -| ----------------- | ---------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| ----------------- | ---------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | --- | ------------------------------------------------------------------------------------- | | id | INTEGER PK | Auto-increment row id | | file_path | TEXT FK | References `files(path)` ON DELETE CASCADE | | name | TEXT | Symbol name | @@ -213,6 +213,7 @@ All tables use `STRICT` mode. Tables marked with `WITHOUT ROWID` store data dire | value | TEXT | Literal value for `const` declarations (strings, numbers, booleans, `null`). NULL for non-literal or non-const symbols. Handles `as const` and simple template literals | | parent_name | TEXT | Name of the enclosing symbol (class, function) for nested symbols. NULL for top-level (module scope). Class methods/properties point to their class | | visibility | TEXT | JSDoc visibility tag derived from `doc_comment` at parse time: `public` / `private` / `internal` / `alpha` / `beta`. NULL when no tag present. Tag must start its own line (after the JSDoc `*` prefix); first match in document order wins. 
Powers the `visibility-tags` recipe and `WHERE visibility = ?` queries via the partial index `idx_symbols_visibility` | +| complexity | REAL | Cyclomatic complexity (McCabe; `1 + decision points`) for function-shaped symbols only. NULL for non-functions (interfaces, types, enums, plain consts) and class methods (v1 limitation). Decision points: `if`, `while`, `do…while`, `for`, `for…in`, `for…of`, `case X:` arms (not `default:`), short-circuit `&&` / ` | | `/`??`, ternary `?:`, and `catch`clauses. Powers the`high-complexity-untested` recipe | ### `calls` — Function-scoped call edges, deduped per file (`STRICT`) diff --git a/docs/glossary.md b/docs/glossary.md index 9a8dba1..520780f 100644 --- a/docs/glossary.md +++ b/docs/glossary.md @@ -99,6 +99,10 @@ CLI subcommand comparing on-disk SHA-256 against `files.content_hash`. Statuses: React components (PascalCase + JSX return or hook usage). PascalCase functions that neither return JSX nor call hooks stay in `symbols` only — never `components`. `hooks_used` is JSON-encoded. See `ComponentRow`. +### `symbols.complexity` / cyclomatic complexity / McCabe + +Per-function decision-point count (REAL column on `symbols`). Computed by the parser walker (`src/parser.ts`) per the McCabe formula: `1 + (decision points)`. Counted nodes: `if`, `while`, `do…while`, `for`, `for…in`, `for…of`, `case X:` (not `default:` — that's the fall-through arm, not a decision), `&&`, `||`, `??`, `?:`, `catch`. Function-shaped symbols only — non-functions (interfaces, types, enums, plain consts) and class methods get `complexity = NULL` (v1 limitation; class methods tracked under `high-complexity-untested.md`). Joins to `coverage` via `(file_path, name, line_start)` natural key for the bundled `high-complexity-untested` recipe (complexity ≥ 10 ⨯ coverage < 50%). + ### `source_fts` (FTS5 virtual table) / `--with-fts` / opt-in full-text Opt-in FTS5 virtual table over file content (`tokenize='porter unicode61'`). 
Always created (near-zero space when empty); populated only when the resolved config has FTS5 enabled (`codemap.config.ts` `fts5: true` OR `--with-fts` CLI flag at index time; CLI wins, logs stderr override). Demonstrates the FTS5 ⨯ `symbols` ⨯ `coverage` JOIN composability that ripgrep can't match — bundled recipe `text-in-deprecated-functions` exemplifies the JOIN. Toggle change auto-detects via `meta.fts5_enabled` and forces a full rebuild so `source_fts` is consistently populated. Stderr telemetry `[fts5] source_fts populated: files / KB` on first populate. Distinct from arbitrary full-text storage — the table is structurally identical to `coverage` (both `WITHOUT ROWID`-class virtual tables in the substrate). Default OFF preserves `.codemap/index.db` size for non-users (~30–50% growth on text-heavy projects). diff --git a/src/db.ts b/src/db.ts index 0a81d18..723e9f9 100644 --- a/src/db.ts +++ b/src/db.ts @@ -2,7 +2,7 @@ import { openCodemapDatabase } from "./sqlite-db"; import type { CodemapDatabase, BindValues } from "./sqlite-db"; /** Bump on any DDL change; `createSchema()` auto-rebuilds on mismatch. */ -export const SCHEMA_VERSION = 7; +export const SCHEMA_VERSION = 8; /** * `meta` key tracking the FTS5 state at the last reindex; mismatch with the @@ -54,7 +54,8 @@ export function createTables(db: CodemapDatabase) { doc_comment TEXT, value TEXT, parent_name TEXT, - visibility TEXT + visibility TEXT, + complexity REAL ) STRICT; CREATE TABLE IF NOT EXISTS imports ( @@ -394,6 +395,14 @@ export interface SymbolRow { * in document order wins when multiple tags are present. */ visibility: string | null; + /** + * Cyclomatic complexity (1 + branching nodes). Function-shaped symbols + * only; `null` for non-function kinds (interfaces, types, enums, plain + * consts) and for symbols without a walked body. Optional for back- + * compat with callers that built `SymbolRow` literals before the + * column existed; absence binds as `null`. 
+ */ + complexity?: number | null; } const BATCH_SIZE = 500; @@ -426,8 +435,8 @@ export function insertSymbols(db: CodemapDatabase, symbols: SymbolRow[]) { batchInsert( db, symbols, - "INSERT INTO symbols (file_path, name, kind, line_start, line_end, signature, is_exported, is_default_export, members, doc_comment, value, parent_name, visibility)", - "(?,?,?,?,?,?,?,?,?,?,?,?,?)", + "INSERT INTO symbols (file_path, name, kind, line_start, line_end, signature, is_exported, is_default_export, members, doc_comment, value, parent_name, visibility, complexity)", + "(?,?,?,?,?,?,?,?,?,?,?,?,?,?)", (s, v) => v.push( s.file_path, @@ -443,6 +452,7 @@ export function insertSymbols(db: CodemapDatabase, symbols: SymbolRow[]) { s.value, s.parent_name, s.visibility, + s.complexity ?? null, ), ); } diff --git a/src/parser.ts b/src/parser.ts index f0dd560..9cba067 100644 --- a/src/parser.ts +++ b/src/parser.ts @@ -131,6 +131,29 @@ export function extractFileData( _scopeStr = scopeStack.join("."); }; + // `symbolIndex = -1` marks anonymous functions (callbacks, IIFEs) — counted + // but never persisted, so their branches don't bleed into outer scopes. + // `arrowFnSymbolIndex` maps each named-arrow init node back to its symbol + // row index — must push from the function-shaped visitors (not the + // VariableDeclaration loop) so multi-declarator `const a = () => …, + // b = () => …` shapes attribute branches per-function, not per-statement. 
+ const complexityStack: { symbolIndex: number; count: number }[] = []; + const arrowFnSymbolIndex = new WeakMap(); + const pushComplexityFor = (symbolIndex: number) => { + complexityStack.push({ symbolIndex, count: 1 }); + }; + const popComplexityTop = () => { + const top = complexityStack.pop(); + if (!top) return; + if (top.symbolIndex >= 0) { + symbols[top.symbolIndex].complexity = top.count; + } + }; + const incrementComplexity = () => { + const top = complexityStack[complexityStack.length - 1]; + if (top) top.count++; + }; + const visitor = new Visitor({ FunctionDeclaration(node: any) { const name = node.id?.name; @@ -141,6 +164,7 @@ export function extractFileData( exportedNames.has(name) || defaultExportedNames.has(name); const isDefault = defaultExportedNames.has(name); + const symbolIndex = symbols.length; symbols.push({ file_path: relPath, name, @@ -156,6 +180,7 @@ export function extractFileData( parent_name: currentParent(), visibility: null, }); + pushComplexityFor(symbolIndex); scopePush(name); if (isTsx && RE_COMPONENT.test(name)) { @@ -168,6 +193,7 @@ export function extractFileData( if (name && scopeStack[scopeStack.length - 1] === name) { scopePop(); } + popComplexityTop(); if (name && currentFunctionScope === name) { maybeAddComponent(name, node, false); currentFunctionScope = null; @@ -189,6 +215,7 @@ export function extractFileData( init?.type === "ArrowFunctionExpression" || init?.type === "FunctionExpression"; + const symbolIndex = symbols.length; symbols.push({ file_path: relPath, name, @@ -209,6 +236,7 @@ export function extractFileData( if (isArrowOrFn) { scopePush(name); + if (init) arrowFnSymbolIndex.set(init, symbolIndex); } if (isTsx && RE_COMPONENT.test(name) && isArrowOrFn) { currentFunctionScope = name; @@ -236,6 +264,19 @@ export function extractFileData( } }, + ArrowFunctionExpression(node: any) { + pushComplexityFor(arrowFnSymbolIndex.get(node) ?? 
-1); + }, + "ArrowFunctionExpression:exit"() { + popComplexityTop(); + }, + FunctionExpression(node: any) { + pushComplexityFor(arrowFnSymbolIndex.get(node) ?? -1); + }, + "FunctionExpression:exit"() { + popComplexityTop(); + }, + TSTypeAliasDeclaration(node: any) { const name = node.id?.name; if (!name) return; @@ -458,6 +499,34 @@ export function extractFileData( JSXFragment() { if (currentFunctionScope) jsxScopes.add(currentFunctionScope); }, + + // Cyclomatic-complexity branching nodes — each adds 1 to the + // currently-walked function's count. Standard McCabe formula: + // CC = 1 + (#decision points). Tracks if/loops/case/catch/&&/||/??/?:. + IfStatement: incrementComplexity, + WhileStatement: incrementComplexity, + DoWhileStatement: incrementComplexity, + ForStatement: incrementComplexity, + ForInStatement: incrementComplexity, + ForOfStatement: incrementComplexity, + ConditionalExpression: incrementComplexity, // `a ? b : c` + CatchClause: incrementComplexity, + SwitchCase(node: any) { + // `default:` is the fall-through arm, not a decision point — only + // count `case X:` arms. + if (node.test !== null && node.test !== undefined) incrementComplexity(); + }, + LogicalExpression(node: any) { + // `&&`, `||`, `??` introduce branching paths; `&` / `|` are bitwise + // (not decision points; AST shapes them as BinaryExpression). + if ( + node.operator === "&&" || + node.operator === "||" || + node.operator === "??" 
+ ) { + incrementComplexity(); + } + }, }); visitor.visit(result.program); diff --git a/templates/agents/rules/codemap.md b/templates/agents/rules/codemap.md index 048c0bf..895b98b 100644 --- a/templates/agents/rules/codemap.md +++ b/templates/agents/rules/codemap.md @@ -105,30 +105,32 @@ Violating this order is wrong even if you get the right answer — it wastes tim If the question looks like any of these → use the index: -| Question shape | Table(s) | -| ------------------------------------------------------------- | -------------------------------------------------------- | -| "What/which files import X?" | `imports` (by `source`) or `dependencies` (by `to_path`) | -| "Where is X defined?" | `symbols` | -| "What does file X export?" | `exports` | -| "What hooks does component X use?" / "List React components" | `components` | -| "What are the CSS variables/tokens for X?" | `css_variables` | -| "Find all TODOs/FIXMEs" | `markers` | -| "Who depends on file X?" / "What does file X depend on?" | `dependencies` | -| "How many files/symbols/components are there?" | any table with `COUNT(*)` | -| "What are the CSS classes in X?" | `css_classes` | -| "What keyframe animations exist?" | `css_keyframes` | -| "What fields does interface/type X have?" | `type_members` | -| "Is symbol X deprecated?" / "What does X do?" | `symbols` (`doc_comment`) | -| "What's `@internal` / `@beta` / `@alpha` / `@private`?" | `symbols.visibility` (parsed JSDoc tag — not regex) | -| "Who calls X?" / "What does X call?" | `calls` | -| "Is symbol X tested?" / "What's the coverage of file Y?" | `coverage` (after `ingest-coverage`) | -| "What's structurally dead AND untested?" | `--recipe untested-and-dead` | -| "Rank files by test coverage" | `--recipe files-by-coverage` | -| "Worst-covered exported functions" | `--recipe worst-covered-exports` | -| "Which components touch deprecated APIs?" | `--recipe components-touching-deprecated` | -| "What's risky to refactor right now?" 
| `--recipe refactor-risk-ranking` | -| "Which exports has nobody imported?" | `--recipe unimported-exports` | -| "Find @deprecated functions with TODO/FIXME and low coverage" | `--recipe text-in-deprecated-functions` (needs FTS5 on) | +| Question shape | Table(s) | +| ------------------------------------------------------------- | --------------------------------------------------------- | +| "What/which files import X?" | `imports` (by `source`) or `dependencies` (by `to_path`) | +| "Where is X defined?" | `symbols` | +| "What does file X export?" | `exports` | +| "What hooks does component X use?" / "List React components" | `components` | +| "What are the CSS variables/tokens for X?" | `css_variables` | +| "Find all TODOs/FIXMEs" | `markers` | +| "Who depends on file X?" / "What does file X depend on?" | `dependencies` | +| "How many files/symbols/components are there?" | any table with `COUNT(*)` | +| "What are the CSS classes in X?" | `css_classes` | +| "What keyframe animations exist?" | `css_keyframes` | +| "What fields does interface/type X have?" | `type_members` | +| "Is symbol X deprecated?" / "What does X do?" | `symbols` (`doc_comment`) | +| "What's `@internal` / `@beta` / `@alpha` / `@private`?" | `symbols.visibility` (parsed JSDoc tag — not regex) | +| "Who calls X?" / "What does X call?" | `calls` | +| "Is symbol X tested?" / "What's the coverage of file Y?" | `coverage` (after `ingest-coverage`) | +| "What's structurally dead AND untested?" | `--recipe untested-and-dead` | +| "Rank files by test coverage" | `--recipe files-by-coverage` | +| "Worst-covered exported functions" | `--recipe worst-covered-exports` | +| "Which components touch deprecated APIs?" | `--recipe components-touching-deprecated` | +| "What's risky to refactor right now?" | `--recipe refactor-risk-ranking` | +| "Which exports has nobody imported?" 
| `--recipe unimported-exports` | +| "Find @deprecated functions with TODO/FIXME and low coverage" | `--recipe text-in-deprecated-functions` (needs FTS5 on) | +| "What's high-complexity AND undertested?" | `--recipe high-complexity-untested` | +| "What's the cyclomatic complexity of symbol X?" | `SELECT name, complexity FROM symbols WHERE name = '...'` | ## When Grep / Read IS appropriate diff --git a/templates/agents/skills/codemap/SKILL.md b/templates/agents/skills/codemap/SKILL.md index 93aab68..9986010 100644 --- a/templates/agents/skills/codemap/SKILL.md +++ b/templates/agents/skills/codemap/SKILL.md @@ -34,7 +34,7 @@ Use **`codemap --root /path/to/project`** (or **`CODEMAP_ROOT`**) to index anoth Replace placeholders (`'...'`) with your module path, file glob, or symbol name. -**CLI shortcuts:** **`codemap query --json --recipe `** runs bundled SQL (preferred for agents). **`codemap query --recipe `** without **`--json`** prints a table. **`codemap query --recipes-json`** prints every bundled recipe (**`id`**, **`description`**, **`sql`**, optional **`actions`**) as JSON (no index / DB required). **`codemap query --print-sql `** prints one recipe’s SQL only. Ids include **`fan-out`**, **`fan-out-sample`** (**`GROUP_CONCAT`** samples), **`fan-out-sample-json`** (same, but **`json_group_array`** — needs SQLite JSON1), **`fan-in`**, **`index-summary`**, **`files-largest`**, **`components-by-hooks`**, **`components-touching-deprecated`** (UNION of hook + call paths to `@deprecated` symbols), **`markers-by-kind`**, **`deprecated-symbols`**, **`refactor-risk-ranking`** (per-file `(fan_in + 1) × (100 - avg_coverage_pct)`), **`text-in-deprecated-functions`** (FTS5 ⨯ symbols ⨯ coverage demo — needs `--with-fts` enabled), **`unimported-exports`** (exports with no detectable importer; v1 doesn't follow re-export chains — see recipe `.md` for caveats), **`visibility-tags`**, **`barrel-files`**, **`files-hashes`** — see **`codemap query --help`**. 
+**CLI shortcuts:** **`codemap query --json --recipe `** runs bundled SQL (preferred for agents). **`codemap query --recipe `** without **`--json`** prints a table. **`codemap query --recipes-json`** prints every bundled recipe (**`id`**, **`description`**, **`sql`**, optional **`actions`**) as JSON (no index / DB required). **`codemap query --print-sql `** prints one recipe’s SQL only. Ids include **`fan-out`**, **`fan-out-sample`** (**`GROUP_CONCAT`** samples), **`fan-out-sample-json`** (same, but **`json_group_array`** — needs SQLite JSON1), **`fan-in`**, **`index-summary`**, **`files-largest`**, **`components-by-hooks`**, **`components-touching-deprecated`** (UNION of hook + call paths to `@deprecated` symbols), **`markers-by-kind`**, **`deprecated-symbols`**, **`refactor-risk-ranking`** (per-file `(fan_in + 1) × (100 - avg_coverage_pct)`), **`high-complexity-untested`** (cyclomatic complexity ≥ 10 + coverage < 50%; per-function), **`text-in-deprecated-functions`** (FTS5 ⨯ symbols ⨯ coverage demo — needs `--with-fts` enabled), **`unimported-exports`** (exports with no detectable importer; v1 doesn't follow re-export chains — see recipe `.md` for caveats), **`visibility-tags`**, **`barrel-files`**, **`files-hashes`**, **`untested-and-dead`** (exported AND uncalled AND uncovered), **`files-by-coverage`** (per-file rollup of statement coverage), **`worst-covered-exports`** (lowest-covered exported symbols) — see **`codemap query --help`**. 
**Output flags** (compose with **`--recipe`** or ad-hoc SQL): @@ -132,22 +132,23 @@ LIMIT 10 ### `symbols` — Functions, types, interfaces, enums, constants, classes -| Column | Type | Description | -| ----------------- | ---------- | ---------------------------------------------------------------------------------------------------------------------------------- | -| id | INTEGER PK | Auto-increment ID | -| file_path | TEXT FK | References `files(path)` | -| name | TEXT | Symbol name | -| kind | TEXT | `function`, `class`, `type`, `interface`, `enum`, `const` | -| line_start | INTEGER | Start line (1-based) | -| line_end | INTEGER | End line (1-based) | -| signature | TEXT | Reconstructed signature with generics and return types | -| is_exported | INTEGER | 1 if exported | -| is_default_export | INTEGER | 1 if default export | -| members | TEXT | JSON enum members (NULL for non-enums) | -| doc_comment | TEXT | Leading JSDoc text (cleaned), NULL when absent | -| value | TEXT | Literal value for consts (`"ok"`, `42`, `true`, `null`) | -| parent_name | TEXT | Enclosing symbol name (class/function), NULL = top-level | -| visibility | TEXT | Line-leading JSDoc tag: `public` / `private` / `internal` / `alpha` / `beta`; NULL when absent. 
First match in document order wins |
+| Column | Type | Description |
+| ----------------- | ---------- | ----------- |
+| id | INTEGER PK | Auto-increment ID |
+| file_path | TEXT FK | References `files(path)` |
+| name | TEXT | Symbol name |
+| kind | TEXT | `function`, `class`, `type`, `interface`, `enum`, `const` |
+| line_start | INTEGER | Start line (1-based) |
+| line_end | INTEGER | End line (1-based) |
+| signature | TEXT | Reconstructed signature with generics and return types |
+| is_exported | INTEGER | 1 if exported |
+| is_default_export | INTEGER | 1 if default export |
+| members | TEXT | JSON enum members (NULL for non-enums) |
+| doc_comment | TEXT | Leading JSDoc text (cleaned), NULL when absent |
+| value | TEXT | Literal value for consts (`"ok"`, `42`, `true`, `null`) |
+| parent_name | TEXT | Enclosing symbol name (class/function), NULL = top-level |
+| visibility | TEXT | Line-leading JSDoc tag: `public` / `private` / `internal` / `alpha` / `beta`; NULL when absent. First match in document order wins |
+| complexity | REAL | Cyclomatic complexity (`1 + decision points`) for function-shaped symbols. NULL for non-functions and class methods (v1). Powers `--recipe high-complexity-untested`. Decision points: `if`, `while`, `do…while`, `for`/`for-in`/`for-of`, `case X:` (not `default:`), `&&`/`\|\|`/`??`/`?:`, `catch` |

 ### `calls` — Function-scoped call edges (deduped per file)

diff --git a/templates/recipes/high-complexity-untested.md b/templates/recipes/high-complexity-untested.md
new file mode 100644
index 0000000..b8823ed
--- /dev/null
+++ b/templates/recipes/high-complexity-untested.md
@@ -0,0 +1,38 @@
+---
+actions:
+  - type: review-test-coverage
+    auto_fixable: false
+    description: "High-complexity function with low coverage — many decision points (if / loops / case / && / || / ?:) AND nobody's exercising them. Add tests before refactoring; bugs on edit are likely."
+---
+
+Functions with cyclomatic complexity `≥ 10` AND measured coverage `< 50%`. Combines two evidence axes — structural (complexity) and runtime (coverage) — to surface refactor-priority candidates that the single-axis recipes (`untested-and-dead`, `worst-covered-exports`) miss because they're "called but undertested."
+
+## Cyclomatic complexity (per `symbols.complexity`)
+
+McCabe formula: `1 + (decision points)`. Branching nodes counted by codemap's parser walker (`src/parser.ts`):
+
+- `if` / `while` / `do…while` / `for` / `for…in` / `for…of`
+- `case X:` arms inside `switch` (the `default:` fall-through is **not** counted — it's not a decision point)
+- `&&` / `||` / `??` short-circuit operators (`?` / `:` ternary too)
+- `catch` clauses
+
+**Computed for function-shaped symbols only** — non-function kinds (interfaces, types, enums, plain consts) and class member methods get `complexity = NULL` and are excluded by `WHERE s.complexity IS NOT NULL`.
+
+## Why the joint signal
+
+- High complexity alone surfaces too many false positives — a heavily-branched config-loader or visitor pattern is fine if it's well-tested.
+- Low coverage alone surfaces too many false positives — a one-line getter with 0% coverage is barely worth testing.
+- The intersection is the actionable list: _complex code that nobody's exercising = bug magnet_.
+
+## Tuning axes for project-local overrides
+
+`/.codemap/recipes/high-complexity-untested.sql`:
+
+- **Complexity threshold**: change `>= 10` to match the project's risk appetite (`5` for strict; `15` for tolerant).
+- **Coverage threshold**: change `< 50` to match the project's risk appetite (`< 80` for strict).
+- **Filter to a directory**: add `AND s.file_path LIKE 'src/api/%'` to scope the results.
+- **Include class members**: complexity is computed per top-level function; class methods are currently `NULL` (see "v1 limitation" below).
+
+## v1 limitation — class methods are NULL
+
+Complexity is currently computed for top-level `function` declarations and arrow-function consts. Class methods (`MethodDefinition`) follow the same shape but don't push to the complexity stack yet. Refactor the `MethodDefinition` visitor in `src/parser.ts` to call `pushComplexityFor` / `popComplexityTop` if class-heavy projects need this.
diff --git a/templates/recipes/high-complexity-untested.sql b/templates/recipes/high-complexity-untested.sql
new file mode 100644
index 0000000..36c55d7
--- /dev/null
+++ b/templates/recipes/high-complexity-untested.sql
@@ -0,0 +1,24 @@
+-- High-complexity functions with low test coverage — refactor / test priority.
+-- Combines structural (cyclomatic complexity ≥ 10) with runtime (coverage < 50%):
+-- the joint signal is "this function has many decision points AND nobody's
+-- exercising them" — high risk for hidden bugs on edit.
+-- Results aren't meaningful until you've run `codemap ingest-coverage `
+-- (Istanbul or LCOV) — without coverage data every high-complexity symbol
+-- appears regardless of testing.
+SELECT
+  s.name,
+  s.kind,
+  s.file_path,
+  s.line_start,
+  s.line_end,
+  s.complexity,
+  ROUND(COALESCE(c.coverage_pct, 0), 1) AS coverage_pct
+FROM symbols s
+LEFT JOIN coverage c ON c.file_path = s.file_path
+  AND c.name = s.name
+  AND c.line_start = s.line_start
+WHERE s.complexity IS NOT NULL
+  AND s.complexity >= 10
+  AND COALESCE(c.coverage_pct, 0) < 50
+ORDER BY s.complexity DESC, s.file_path, s.name
+LIMIT 30
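
The stack bookkeeping that the `src/parser.ts` hunk adds can be checked by hand with a small simulation. The sketch below is only an illustration under stated assumptions: `Frame`, `WalkEvent`, `computeComplexity`, and the hand-written event list are hypothetical names that do not exist in codemap, and the real parser drives the same push / increment / pop steps from AST visitor callbacks rather than from an event array.

```typescript
// Hypothetical stand-in for the diff's complexityStack entries.
type Frame = { symbolIndex: number; count: number };

// Hypothetical walk events: entering/leaving a function, or hitting a branch node.
type WalkEvent =
  | { kind: "enter"; symbolIndex: number } // -1 marks an anonymous function
  | { kind: "exit" }
  | { kind: "branch" }; // if / loop / case X / catch / && / || / ?? / ?:

function computeComplexity(
  symbolCount: number,
  events: WalkEvent[],
): (number | null)[] {
  // Symbols that are never entered as functions keep NULL complexity.
  const complexity = new Array<number | null>(symbolCount).fill(null);
  const stack: Frame[] = [];
  for (const ev of events) {
    if (ev.kind === "enter") {
      // Every function starts at 1 (McCabe: 1 + decision points).
      stack.push({ symbolIndex: ev.symbolIndex, count: 1 });
    } else if (ev.kind === "branch") {
      // A branch is charged to the innermost function only.
      const top = stack[stack.length - 1];
      if (top) top.count++;
    } else {
      // On exit, persist named functions; anonymous counts are dropped.
      const top = stack.pop();
      if (top && top.symbolIndex >= 0) complexity[top.symbolIndex] = top.count;
    }
  }
  return complexity;
}

// Models: function outer(xs) {
//   if (...) {...}
//   xs.forEach((x) => { if (a && b) {...} });
//   for (...) {...}
// }
const events: WalkEvent[] = [
  { kind: "enter", symbolIndex: 0 }, // outer
  { kind: "branch" }, // if
  { kind: "enter", symbolIndex: -1 }, // anonymous callback
  { kind: "branch" }, // if inside the callback
  { kind: "branch" }, // && inside the callback
  { kind: "exit" }, // callback count is discarded
  { kind: "branch" }, // for
  { kind: "exit" }, // outer is persisted
];

const result = computeComplexity(2, events);
console.log(result); // outer gets 3; symbol 1 (never entered) stays null
```

Under these assumptions the callback's two branches do not raise `outer`'s count, which matches the comment in the diff that anonymous functions are "counted but never persisted."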