
feat(biomarkers): four new rules — unused_export, long_parameter_list, magic_number, hardcoded_url #5

Open
mschreib28 wants to merge 68 commits into main from upstream/feat/biomarkers-more-rules

Conversation

@mschreib28
Owner

## Summary

Extends the biomarker engine (PR colbymchenry#125) with four new rules. Three operate on metrics gathered during the per-symbol AST walk; one is a cross-file graph rule.

### Per-symbol additions to SymbolMetrics

- paramCount — counted from the function/method's parameter list at AST walk time
- magicNumberCount — numeric literals not in the trivial allow-list (0, 1, -1, 2, plus float variants)
- hardcodedUrlCount — string literals matching http(s)|ws(s)|ftp|s3|gs URLs

### New rules

| Rule | Severity tiers |
|------|----------------|
| long_parameter_list | info=4 / warning=5 / error=7 (Clean Code thresholds) |
| magic_number | info=3 / warning=5 / error=8 |
| hardcoded_url | info on first / warning at 2 / error at 3+ |
| unused_export | warning — single SQL query: nodes with is_exported=1 whose only incoming edges are contains/exports/imports/tests (or come from the same file) |

### DB additions

- findUnusedExports() — single graph query over the whole index
- appendFindings(...) — insert without delete (needed by cross-file rules so they don't clobber per-file rule output)
- clearFindingsByKind(...) — wipe one biomarker kind globally before re-emitting
- replaceFindingsForFile updated to preserve unused_export findings on per-file replace, so sync runs that touch a file don't silently lose its cross-file findings without recomputing them

### Edge-kind exclusions for unused_export

"Used" means a real consumer edge from outside the symbol's file. Excluded as not-real-use:

- contains — structural; every method is contained by its class
- exports — re-export barrels (export { foo } from './a') forward but don't consume
- imports — the import statement itself isn't a use; real use is calls/references/instantiates/extends/etc.
- tests — convention-derived test→subject edges aren't a semantic API call

## Test plan

- [x] Live verification — synthetic stress fixture with one 7-param function using 5 magic numbers and 3 dead exports:

      fetchAll → long_parameter_list error (7), magic_number error (10)
      unused1, unused2, consumeFetch → unused_export warning

- [x] Unit tests added — 6 new evaluateRules cases covering each new rule's trigger thresholds (long_parameter_list warning at 5 + clean at 3, magic_number error at 8 + clean at 2, hardcoded_url info/warning/error tiers + clean at 0). Plus a metrics() helper so future metric additions don't bloat the test suite.
- [x] npx vitest run — 830 passed
- [x] npx tsc --noEmit clean
- [x] npm run build succeeds

## Notes

- Pre-existing "clean code = 0 findings" test fixture changed to use non-exported helpers (semantically those WERE unused exports under the new rule — calibration artefact, not a regression)
- unused_export is skipped on partial scans (sync with filePaths) because it depends on global graph state; replaceFindingsForFile preserves existing unused_export rows on touched files so they remain valid until the next full pass
- appendFindings + clearFindingsByKind are wrapped in a single queries.transaction(...) so concurrent readers never see the intermediate empty state

🤖 Generated with Claude Code
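The tiered thresholds above reduce to a small pure function. A minimal sketch — names and the return shape are illustrative, not the actual rules API:

```typescript
// Hypothetical sketch of tiered severity evaluation; the real rule
// implementation attaches findings rather than returning a string.
type Severity = "info" | "warning" | "error" | null;

function tieredSeverity(count: number, info: number, warning: number, error: number): Severity {
  if (count >= error) return "error";
  if (count >= warning) return "warning";
  if (count >= info) return "info";
  return null; // below every tier: no finding
}

// long_parameter_list: info=4 / warning=5 / error=7
const longParams = (paramCount: number) => tieredSeverity(paramCount, 4, 5, 7);
// magic_number: info=3 / warning=5 / error=8
const magicNumbers = (n: number) => tieredSeverity(n, 3, 5, 8);
```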


Copied from colbymchenry/codegraph#132

andreinknv and others added 30 commits April 26, 2026 00:18
`git ls-files` (used for both the initial scan and incremental sync)
does not enter submodules — they appear as gitlink entries with their
contents invisible. As a result, source files inside submodules were
silently skipped during indexing.

Both file-discovery paths now recurse into active submodules:

  - getGitVisibleFiles (full index) enumerates active submodules via
    `git submodule foreach --recursive --quiet 'echo "$displaypath"'`
    and runs `git ls-files -co --exclude-standard` inside each, prefixing
    the submodule path so files are reported relative to the parent root.

  - getGitChangedFiles (sync) was refactored to share its status-parsing
    logic between the parent repo and each submodule. Submodule directory
    entries that the parent's status emits when a submodule pointer moves
    (e.g., " m vendor/sub") are filtered out so we don't try to read a
    directory as a file.

Submodule indexing is on by default and can be disabled via
`indexSubmodules: false` in CodeGraphConfig — useful for repos with
large vendor submodules that should remain unindexed without having to
add a path-based exclude. Uninitialized / missing submodules are
silently skipped (best-effort enhancement on top of the existing scan).

Status output paths are now C-style-unquoted before being used or
compared against the submodule directory set, so submodule paths
containing spaces or non-ASCII bytes are handled correctly. The parent
status command failing still falls back to the full filesystem scan via
a null return, preserving the prior contract; only submodule-internal
status failures are absorbed silently.

Closes colbymchenry#86.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a "Using with Other MCP Clients" section to the README with
copy-pastable config for opencode, Cursor, LangChain
(MultiServerMCPClient), and the Claude Agent SDK, plus a generic
stdio-MCP fallback. Each client gets the exact field names it
actually expects (the existing README only documented the Claude
Code / ~/.claude.json shape).

Notes:
- The CodeGraph MCP server speaks stdio only, so the LangChain
  example explicitly passes `transport: "stdio"` (the issue
  reporter had been trying to use SSE config) and there's a
  closing note pointing SSE-only clients at supergateway as a
  bridge.
- The generic-fallback section documents the `--path` flag for
  clients that don't send a `rootUri` in the initialize request.

Closes colbymchenry#65, colbymchenry#79.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds docs/ADDING-A-LANGUAGE.md walking through every step a contributor
needs to add a new language extractor:

  1. Source a tree-sitter wasm grammar — covers the three real-world
     paths (already in tree-sitter-wasms, pre-built release artifact,
     build from source via tree-sitter-cli's bundled wasi-sdk).
  2. Probe the AST with a small scratch script before writing code.
  3. Register in src/types.ts + src/extraction/grammars.ts.
  4. Type-check before adding extraction logic.
  5. Pick a pattern: LanguageExtractor config (procedural / OO) or a
     self-contained extractor class (declarative / template / non-OO).
  6. Map onto existing NodeKind / EdgeKind values.
  7. Tests + end-to-end CLI smoke.
  8. PR description checklist.

Each section points at the existing extractors as worked examples
(R for the OO path, HCL/SQL/Liquid for the custom path, Pascal+DFM
for the cross-format case). README.md and CLAUDE.md gain a one-line
pointer to the cookbook.

Closes colbymchenry#55.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four user-visible bugs caught by an independent audit pass:

1. Svelte symbols reported on the wrong line
   src/extraction/svelte-extractor.ts:144

   The script-block regex captures content starting with the leading newline
   that follows `>`, so the inner extractor sees that newline as line 1 of
   its 1-indexed input and the first real code on line 2. The previous
   `contentStartLine = scriptTagLine + openingTagLines + 1` was added to
   that 1-indexed line number, shifting every Svelte symbol's startLine /
   endLine off by 1. Drops the `+1`. Five regression tests added covering
   single-line, multi-line opening tag, template-offset, single-line
   no-newline, and dual module/instance script blocks.
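The off-by-one is easiest to see as arithmetic. A sketch under assumed semantics (scriptTagLine is the 1-indexed line where the opening tag begins; openingTagLines is how many extra lines the opening tag spans; the captured content starts with the newline after `>`, so the inner extractor's line 1 is that leading blank remainder) — the real bookkeeping in svelte-extractor.ts may name things differently:

```typescript
// Map a 1-indexed line number reported by the inner extractor back to the
// parent .svelte file. Because the captured content begins with the newline
// after `>`, the inner extractor already counts one leading line — so no
// additional +1 is needed (that +1 was the bug).
function toParentLine(innerLine: number, scriptTagLine: number, openingTagLines: number): number {
  return scriptTagLine + openingTagLines + innerLine - 1;
}
```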

2. Watcher silently dropped pending changes on sync failure
   src/sync/watcher.ts:177

   `hasChanges = false` ran before the sync attempt, so a thrown sync
   (DB locked, transient FS error) left the pending batch forgotten until
   a NEW file event arrived. Re-set `hasChanges = true` in the catch path
   so a transient failure schedules a retry on its own. Regression test
   added (mocks fail-then-succeed, asserts the second call happens
   without a new file event).
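The fix's shape, stripped of debouncing and batching (hypothetical class; the real watcher is callback-driven and async):

```typescript
// Sketch of the retry-on-failure pattern: the flag is cleared optimistically
// before the attempt, and re-armed in the catch path so a transient failure
// schedules its own retry instead of waiting for a new file event.
class SyncScheduler {
  hasChanges = false;

  constructor(private sync: () => void) {}

  flush(): void {
    if (!this.hasChanges) return;
    this.hasChanges = false; // optimistic clear before the attempt
    try {
      this.sync();
    } catch {
      this.hasChanges = true; // re-arm so the next tick retries
    }
  }
}
```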

3. Graph traversal default maxDepth was Infinity
   src/graph/traversal.ts:14, src/types.ts:301

   `limit: 1000` capped returned nodes, but during traversal the visited
   set and BFS/DFS frontier can grow far beyond `limit` on highly
   connected graphs before the cap kicks in. Default is now 10. Callers
   who really need exhaustive traversal can still pass `maxDepth:
   Infinity` explicitly — the JSDoc documents this. This is a
   public-API behavior change; existing tests pass.

   Also caps `findPath`'s BFS queue at 100,000 entries
   (FIND_PATH_MAX_QUEUE) and returns null if exceeded — each entry holds
   a cloned path array, so on dense graphs the queue could otherwise
   consume gigabytes before either finding a path or exhausting the
   search.
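Both bounds can be sketched together in a toy BFS path search — illustrative only, not the traversal.ts API:

```typescript
// Depth-capped BFS with a bounded frontier. Each queue entry is a cloned
// path array, so on dense graphs an unbounded queue can consume gigabytes;
// the cap makes the search give up (return null) instead.
const FIND_PATH_MAX_QUEUE = 100_000;

function findPath(
  edges: Map<string, string[]>,
  start: string,
  goal: string,
  maxDepth = 10, // exhaustive callers pass Infinity explicitly
): string[] | null {
  const queue: string[][] = [[start]];
  const visited = new Set<string>([start]);
  while (queue.length > 0) {
    if (queue.length > FIND_PATH_MAX_QUEUE) return null; // bail before memory blows up
    const path = queue.shift()!;
    const node = path[path.length - 1];
    if (node === goal) return path;
    if (path.length - 1 >= maxDepth) continue; // depth cap
    for (const next of edges.get(node) ?? []) {
      if (visited.has(next)) continue;
      visited.add(next);
      queue.push([...path, next]);
    }
  }
  return null;
}
```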

4. `findRelevantContext` did not bound caller-supplied limits
   src/context/index.ts:284

   `searchLimit` is multiplied by 5 in `findNodesByExactName` and feeds
   several other unbounded operations; a caller passing
   `searchLimit: 1_000_000` would pull millions of rows. Now clamped:
   searchLimit ∈ [1, 100], maxNodes ∈ [1, 1000], traversalDepth ∈ [0, 10].
   Regression test asserts a 1e9 input is bounded.
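The clamping itself is trivial; a sketch (the default values below are placeholders, not the real defaults in src/context/index.ts):

```typescript
// Clamp a caller-supplied limit into a sane integer range.
function clampInt(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, Math.floor(value)));
}

// Bounds from the commit message: searchLimit ∈ [1,100], maxNodes ∈ [1,1000],
// traversalDepth ∈ [0,10]. Defaults here are illustrative assumptions.
const boundLimits = (opts: { searchLimit?: number; maxNodes?: number; traversalDepth?: number }) => ({
  searchLimit: clampInt(opts.searchLimit ?? 20, 1, 100),
  maxNodes: clampInt(opts.maxNodes ?? 50, 1, 1000),
  traversalDepth: clampInt(opts.traversalDepth ?? 2, 0, 10),
});
```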

All 387 tests pass serialized; tsc clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bundles seven smaller hardening fixes from the audit pass:

1. Symlink-resistant path validation in sync paths
   src/extraction/index.ts:1148, 1206, 1293, 1347
   src/utils.ts (new validatePathWithinRootReal)

   getChangedFiles() and sync() now resolve symlinks via realpath before
   reading, so a regular file swapped for a symlink to outside the
   project (between scan and read) is rejected rather than followed.
   The lexical-only validatePathWithinRoot stays for paths that don't
   yet exist on disk.
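A sketch of the realpath-based containment check — the assumed shape of validatePathWithinRootReal, not the exact utils.ts code:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// True only if the candidate, after resolving every symlink, still lives
// under the (also symlink-resolved) root. A regular file swapped for a
// symlink pointing outside the project resolves elsewhere and is rejected.
function validatePathWithinRootReal(root: string, candidate: string): boolean {
  let real: string;
  try {
    real = fs.realpathSync(path.resolve(root, candidate)); // follows symlinks
  } catch {
    return false; // doesn't exist on disk: the lexical check handles that case
  }
  const realRoot = fs.realpathSync(root);
  return real === realRoot || real.startsWith(realRoot + path.sep);
}
```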

2. storeExtractionResult is now atomic
   src/extraction/index.ts:1018-1098, src/db/queries.ts (new
   QueryBuilder.transaction)

   delete + insert + upsert run in a single SQLite transaction, so a
   process kill mid-write can no longer leave the file's old data
   wiped while the new data is missing. Nested transactions inside
   insertNodes/insertEdges/insertUnresolvedRefsBatch/deleteFile work
   correctly because better-sqlite3 collapses them via SAVEPOINT.

3. recycleWorker is now actually async
   src/extraction/index.ts:573

   Previously declared sync but called via `await`. Now properly
   awaits worker.terminate() with a 1s timeout (so a wedged WASM
   worker can't block the caller indefinitely) and force-fires
   terminate after the timeout to avoid zombie threads.

4. Parse-worker handles unknown message types
   src/extraction/parse-worker.ts:60-78

   When a malformed message arrives with an `id`, the worker sends
   back a structured error instead of silently dropping it (which
   previously caused the main thread to block until the per-file
   timeout). Messages without an id are still ignored — no pending
   promise to unblock.

5. ReDoS-safe glob matching
   src/utils.ts (new globToSafeRegex), src/bin/codegraph.ts:1163,
   src/graph/queries.ts:196

   Coalesces runs of `*` so hostile inputs like `*****` map to a
   single `.*` rather than five chained quantifiers (which would
   catastrophically backtrack on long inputs). Caps input at 1024
   chars. Implementation walks character-by-character with no
   sentinels — avoids any chance of marker-string collisions with
   user input.
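A sketch of the star-coalescing conversion (assumed behavior of globToSafeRegex; the real helper may handle more glob syntax):

```typescript
const GLOB_MAX_LEN = 1024;

// Convert a glob to a regex without backtracking hazards: runs of `*`
// collapse to a single `.*`, and over-long inputs are rejected outright.
function globToSafeRegex(glob: string): RegExp | null {
  if (glob.length > GLOB_MAX_LEN) return null; // cap hostile inputs
  let out = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob[i];
    if (ch === "*") {
      while (glob[i + 1] === "*") i++; // coalesce: ***** → one .*
      out += ".*";
    } else if (ch === "?") {
      out += ".";
    } else {
      out += ch.replace(/[.+^${}()|[\]\\]/g, "\\$&"); // escape regex metachars
    }
  }
  return new RegExp(`^${out}$`);
}
```

Character-by-character walking (rather than placeholder substitution) is what avoids any marker-string collision with user input.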

6. fileUriToPath fallback hardening
   src/mcp/index.ts:36-40

   Catch-block fallback for malformed file:// URIs now runs through
   path.resolve() so a `file:///../etc/passwd` style input is
   normalized rather than passed raw to downstream filesystem code.

7. Reference dedup at extraction time + dead-code removal
   src/extraction/tree-sitter.ts (new dedupeReferences),
   src/db/queries.ts (deleteUnresolvedByNode removed)

   A function that calls `foo()` 100 times now produces 1 unresolved
   reference instead of 100; the resolver collapses duplicates anyway
   but extraction was wasteful. Removed the never-called
   deleteUnresolvedByNode helper (FK cascade handles cleanup).
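The dedup is a straightforward keyed pass; a sketch with an assumed record shape (the real UnresolvedRef carries more fields):

```typescript
interface UnresolvedRef {
  fromNode: string;
  name: string;
  kind: string;
}

// Collapse duplicate references at extraction time: a function calling
// foo() 100 times emits one unresolved reference instead of 100.
function dedupeReferences(refs: UnresolvedRef[]): UnresolvedRef[] {
  const seen = new Set<string>();
  const out: UnresolvedRef[] = [];
  for (const r of refs) {
    const key = `${r.fromNode}\u0000${r.name}\u0000${r.kind}`; // NUL-joined to avoid collisions
    if (!seen.has(key)) {
      seen.add(key);
      out.push(r);
    }
  }
  return out;
}
```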

3 new tests for globToSafeRegex; full suite 383/383 passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sync used to rely solely on `git status --porcelain` for change detection,
which only reports working-tree dirtiness vs HEAD. After a `git merge`
(or pull, checkout, rebase, reset, post-commit), the working tree is
clean and `git status` reports nothing, so sync silently became a no-op
while the DB still held pre-operation content hashes. MCP queries then
served stale data with no warning.

Sync now records the HEAD SHA it was last synced against (in the existing
project_metadata table) and, when current HEAD differs, unions
`git diff --name-status <last>..HEAD` into the changed-file set. If the
recorded HEAD is unreachable (force-push, gc), sync falls back to the
filesystem scan path, which is correct regardless of git history state.

The same fix is applied to getChangedFiles() so MCP staleness signals
stay accurate.

Adds 5 regression tests covering merge, branch checkout, committed
deletion, unreachable last-synced HEAD, and the no-op clean-tree case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ork regex)

Three accuracy bugs caught by an audit pass, bundled in one PR.

### 1. UTF-8 BOM caused spurious "modified" hash mismatches

`src/extraction/index.ts` (hashContent)

A file written with a BOM by one editor and re-saved without a BOM by
another (VSCode strips by default; some Windows editors preserve it)
hashed to two different values. Sync then reported the file as modified
on every run. hashContent now strips a leading U+FEFF before hashing.
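The fix in sketch form (the hash algorithm here is an assumption; only the BOM strip is what the commit describes):

```typescript
import { createHash } from "node:crypto";

// U+FEFF at position 0 is a byte-order mark, not content.
function stripBom(content: string): string {
  return content.charCodeAt(0) === 0xfeff ? content.slice(1) : content;
}

// BOM-insensitive content hash: a file saved with and without a BOM now
// hashes identically, so sync no longer reports a spurious "modified".
function hashContent(content: string): string {
  return createHash("sha256").update(stripBom(content)).digest("hex");
}
```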

### 2. Parse-retry comment strip was a no-op for Python, Ruby, etc.

`src/extraction/index.ts` (last-resort retry path), `src/utils.ts`

The "shrink the file by removing comment-only lines" fallback used
`/^\s*\/\//.test(line)` for every language. Files whose comment marker
is not `//` (Python `#`, Ruby `#`) had nothing stripped, so the retry
ran the same content that had already crashed and the file silently
stayed unindexed. Added a per-language LINE_COMMENT_MARKER table and a
stripCommentLinesForRetry helper used at the retry call site.
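A sketch of the language-aware strip (the marker table is abbreviated here; the real LINE_COMMENT_MARKER covers every supported language):

```typescript
// Per-language line-comment markers — abbreviated illustration.
const LINE_COMMENT_MARKER: Record<string, string> = {
  typescript: "//",
  javascript: "//",
  python: "#",
  ruby: "#",
};

// Drop comment-only lines before the last-resort re-parse, using the
// marker for the file's actual language instead of assuming `//`.
function stripCommentLinesForRetry(content: string, language: string): string {
  const marker = LINE_COMMENT_MARKER[language];
  if (!marker) return content; // unknown language: retry unchanged
  return content
    .split("\n")
    .filter((line) => !line.trimStart().startsWith(marker))
    .join("\n");
}
```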

### 3. Framework route extractors matched docstrings/comments

`src/resolution/frameworks/{python,express,laravel,rust,csharp}.ts`

`pattern.exec(content)` ran the route regex over raw file content, so a
route example in a Python docstring or a commented-out route in JS was
extracted as a real route node. AI assistants then saw phantom routes
that do not exist in the running app.

Added stripCommentsForRegex (utils.ts) which neutralizes block comments,
whole-line line comments, and (for Python) triple-quoted strings,
preserving newlines so match.index maps back to the original line
numbers. Applied at the top of every framework extractor that runs a
regex over content. Deliberately does NOT strip arbitrary string
literals, since those carry the actual route paths the regex needs.

Languages covered: js/ts/tsx/jsx, java, csharp, c/cpp, go, rust, swift,
kotlin, dart, scala, php, python, ruby, pascal.
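A simplified sketch of the neutralization — the real stripCommentsForRegex also handles Python triple-quoted strings and per-language markers; the key property shown here is that newlines survive so match.index maps back to original line numbers:

```typescript
// Blank out comments while preserving line structure. String literals are
// deliberately left intact: they carry the route paths the regexes need.
function stripCommentsForRegex(content: string, lineMarker = "//"): string {
  // Block comments: replace every non-newline character with a space.
  let out = content.replace(/\/\*[\s\S]*?\*\//g, (m) => m.replace(/[^\n]/g, " "));
  // Whole-line line comments: blank the line, keep its length.
  out = out
    .split("\n")
    .map((line) => (line.trimStart().startsWith(lineMarker) ? " ".repeat(line.length) : line))
    .join("\n");
  return out;
}
```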

## Files changed

| File | Change |
|---|---|
| src/utils.ts | Add stripBom, stripCommentsForRegex, stripCommentLinesForRetry, LINE_COMMENT_MARKER table |
| src/extraction/index.ts | Strip BOM in hashContent; use language-aware retry strip |
| src/resolution/frameworks/python.ts | Strip comments before django/flask/fastapi route regex |
| src/resolution/frameworks/express.ts | Strip comments before express route regex |
| src/resolution/frameworks/laravel.ts | Strip comments before laravel route regex |
| src/resolution/frameworks/rust.ts | Strip comments before actix/rocket/axum route regex |
| src/resolution/frameworks/csharp.ts | Strip comments before aspnet route regex |
| __tests__/extraction-resolution-accuracy.test.ts | 21 regression tests |

## Test plan

- [x] npm test: 400/400 pass on macOS (one pre-existing fs.watch flake under parallel load, passes in isolation)
- [x] npx tsc --noEmit clean
- [x] 21 new tests covering: BOM normalization, per-language line stripping, and false-positive-prevention for every affected framework

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reviewer caught: the minimalApiPattern loop in csharp.ts (Map{Get,Post,...}
ASP.NET Core 6+ style) was not updated when the routePatterns loop above
it was switched to use the comment-stripped content, leaving commented-out
app.MapGet calls still being extracted as real routes.

Added a regression test asserting both line-comment and block-comment
forms are skipped for minimal-API routes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The edges table has `id INTEGER PRIMARY KEY AUTOINCREMENT` and no other
UNIQUE constraint. The codebase uses

    INSERT OR IGNORE INTO edges (source, target, kind, ...) VALUES (...)

clearly intending dedup, but the only candidate key for OR IGNORE was
the autoincrement id (which never conflicts) — so the OR IGNORE was a
silent no-op. Any code path that re-emits the same edge (resolver retries,
partial-failure re-runs, framework extractors that double-emit) silently
inserted duplicates, inflating call graphs in codegraph_callers/callees.

This change adds a real UNIQUE index on the natural key:

    UNIQUE INDEX idx_edges_unique
      ON edges(source, target, kind, COALESCE(line, -1), COALESCE(col, -1))

COALESCE keeps two NULL line/col values comparable as equal — SQLite
treats raw NULLs in a UNIQUE index as distinct, which would otherwise
defeat dedup for edges that don't carry a line/col (1-indexed everywhere
in this codebase, so -1 is a safe sentinel).

Migration v4 first deduplicates pre-existing rows (DELETE ... WHERE id
NOT IN (SELECT MIN(id) FROM edges GROUP BY source, target, kind,
COALESCE(line, -1), COALESCE(col, -1))) then creates the index. Both run
inside the migration transaction wrapper so a crash leaves the DB
consistent.

CURRENT_SCHEMA_VERSION bumped to 4. Two existing version-pinned tests
updated to match.

## Files changed

| File | Change |
|---|---|
| src/db/schema.sql | Add UNIQUE INDEX idx_edges_unique for fresh installs |
| src/db/migrations.ts | Bump version to 4; add migration v4 (dedup + index) |
| __tests__/edges-unique.test.ts | 7 regression tests |
| __tests__/foundation.test.ts | Update expected schema version |
| __tests__/pr19-improvements.test.ts | Update expected schema version |

## Test plan

- [x] npm test: 387/387 pass on macOS (one pre-existing fs.watch flake under parallel load, passes in isolation)
- [x] npx tsc --noEmit clean
- [x] Independent reviewer pass before pushing — APPROVE; nits-only

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The .codegraphignore marker (per-directory opt-out from indexing) was
respected by `scanDirectoryWalk` (the filesystem-walk fallback) but
silently ignored by `getGitVisibleFiles` (the git fast path) and
`getGitChangedFiles` (sync's git path). Same project gave different
file sets depending on whether `.git` existed — typically the marker
"worked" only on non-git scratch projects and was a no-op everywhere
else, which is the opposite of how most users encounter it.

This change adds two helpers in `src/extraction/index.ts`:

  - `findCodegraphIgnoredDirs(rootDir, files)` — walks parent directories
    of the given file list, returns the set of directories that contain
    a `.codegraphignore` marker. Walks once per unique parent directory,
    with an early-out on shared ancestors.

  - `isUnderCodegraphIgnoredDir(filePath, ignoredDirs)` — true if filePath
    lives under any of those dirs.

Applied in:
  - `scanDirectory` and `scanDirectoryAsync` — between the git file list
    and the include-pattern filter.
  - `getGitChangedFiles` — refactored to a two-pass collect-then-bucketize
    so the ignored-dir set is built once from the candidate paths.

The marker file itself does not need to be tracked by git — fs.existsSync
catches it whether it was committed or added as a local override.

## Files changed

| File | Change |
|---|---|
| src/extraction/index.ts | Add findCodegraphIgnoredDirs + isUnderCodegraphIgnoredDir; apply in scanDirectory, scanDirectoryAsync, getGitChangedFiles |
| __tests__/codegraphignore.test.ts | 6 regression tests |

## Test plan

- [x] npm test: 386/386 pass on macOS (one pre-existing fs.watch flake under parallel load, passes in isolation)
- [x] npx tsc --noEmit clean
- [x] Independent reviewer pass before pushing — APPROVE; addressed two info-level cleanups (JSDoc accuracy, removed dead try/catch around fs.existsSync)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The codebase no longer ships embeddings (commit 453c39d), so all search
quality has to come from FTS. The maintainer's evidence in PR colbymchenry#74
documented several queries where FTS-only badly trailed semantic search
because the SQLite default tokenizer treats `getParser` as a single
indivisible token. Three changes that compound to fix that:

1. **Subword tokens.** New `name_subwords` column on `nodes` populated
   with the camel/snake split of the identifier (kept alongside the
   original) and indexed by FTS5 at weight 10x. A query for `parser`
   now finds `getParser` at the FTS layer, not just via post-hoc
   rescoring on the limited candidate set BM25 surfaces.

2. **Porter stemmer.** `tokenize="porter unicode61"` on the FTS table
   collapses morphological variants — `parser`/`parsing`/`parses` all
   stem to `pars` so a natural-language query matches identifier subwords
   and docstring prose alike.

3. **Stopword stripping.** `searchNodesFTS` now filters stopwords from
   the query before constructing the OR-join. Without this, words like
   `how` / `does` / `the` become OR'd FTS hits against any prose-bearing
   docstring and crowd out the actually-relevant identifier tokens.
   Reuses the existing `STOP_WORDS` set in src/search/query-utils.ts via
   a new shared `filterStopwords` helper.
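The subword split behind change 1 can be sketched as follows (assumed behavior of splitIdentifierTokens; the real helper may normalize differently):

```typescript
// Split an identifier into lowercase subword tokens: camelCase, ACRONYMWord,
// snake_case, and kebab-case boundaries all become separate tokens.
// Deduped, per the review finding ("parse" no longer stores "parse parse").
function splitIdentifierTokens(name: string): string[] {
  const spaced = name
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // camelCase boundary
    .replace(/([A-Z]+)([A-Z][a-z])/g, "$1 $2"); // HTTPServer → HTTP Server
  const tokens = spaced
    .split(/[^A-Za-z0-9]+/) // snake_case, kebab-case, spaces
    .map((t) => t.toLowerCase())
    .filter(Boolean);
  return [...new Set(tokens)];
}
```

With these tokens indexed alongside the original name, a query for `parser` hits `getParser` at the FTS layer itself.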

## Empirical results (codegraph's own src/, 1242 nodes, 71 files)

| Query | baseline rank | this PR rank |
|---|---:|---:|
| `ExtractionOrchestrator` | 1 | 1 |
| `how does file parsing work` | NOT FOUND in 20 | 2 |
| `database connection management` | 18 | 1 |
| `resolves references between modules` | 19 | 2 |

Mean rank: ~14 → 1.5.

Concept-mode docstring re-weighting was tested as a fourth lever and
rejected — it regressed `how does file parsing work` because amplifying
docstring weight floods the result list with prose-keyword spam more
than it lifts truly relevant prose. Not included.

## Migration v4

Existing v3 databases get migrated by:
  - Adding the `name_subwords` column to `nodes` (idempotent guard so a
    re-run after partial DDL failure doesn't fail with "duplicate column")
  - Dropping the old FTS table + triggers (tokenize cannot be ALTERed)
  - Recreating FTS without triggers
  - Backfilling name_subwords for every existing node
  - Rebuilding the FTS index in one shot via `INSERT INTO nodes_fts(nodes_fts) VALUES('rebuild')`
  - Recreating the triggers afterward (so they don't fire mid-backfill,
    which corrupted FTS5 in earlier prototype runs)

## Files changed

| File | Change |
|---|---|
| `src/utils.ts` | Add `splitIdentifierTokens`, `buildNameSubwords` |
| `src/search/query-utils.ts` | Add shared `filterStopwords` helper using existing STOP_WORDS |
| `src/db/schema.sql` | Add `name_subwords` column, add it to nodes_fts, add `tokenize="porter unicode61"`, update triggers |
| `src/db/migrations.ts` | Bump version to 4; add migration v4 with idempotent ALTER guard |
| `src/db/queries.ts` | Populate name_subwords on insert/update; new BM25 weights; stopword filter in searchNodesFTS |
| `__tests__/foundation.test.ts`, `__tests__/pr19-improvements.test.ts` | Update expected schema version |
| `__tests__/search-quality.test.ts` | 21 regression tests including helpers, end-to-end search, full v3-to-v4 migration, and migration idempotency |

## Test plan

- [x] `npm test`: 404/404 pass on macOS (one pre-existing fs.watch flake under parallel load, passes in isolation)
- [x] `npx tsc --noEmit` clean
- [x] Bench script confirms the three target queries (ranked #18, #19, and NOT FOUND at baseline) jump to #1, #2, and #2 with this PR
- [x] Independent reviewer pass before pushing — addressed three findings:
  - merged duplicate stopword sets (now uses STOP_WORDS from query-utils.ts)
  - dedup tokens in buildNameSubwords (`parse` no longer stores `parse parse`)
  - made migration idempotent on partial-DDL re-run

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a new graph layer that surfaces coupling between files which
historically change together — sibling extractors that share patterns,
tests that assert schema state, config files coupled to the code that
reads them. This information is invisible to static analysis but
recoverable from git in a few seconds with no model dependencies.

Empirically verified on codegraph's own history:

- src/extraction/languages/csharp.ts <-> java.ts  (J=0.75)
  No import relation; coupled because both implement language extractors
  that share patterns.
- __tests__/foundation.test.ts <-> src/db/schema.sql  (J=0.57)
  Test asserts schema version; coupled by intent, not by import.
- src/db/migrations.ts <-> src/db/schema.sql  (J=0.57)
  Schema changes always require a migration.

## Components

- **Schema**: new co_changes table (file_a, file_b, count) with canonical
  ordering (file_a < file_b) and CHECK constraint. New commit_count
  column on files for Jaccard normalisation. Migration v4 (idempotent
  PRAGMA-table_info guard on the ALTER).

- **Miner** (src/cochange/index.ts): mineCoChanges runs
  `git log --no-merges --name-only --format=tformat:CGCMT-%H -z`. Header
  sentinel CGCMT-<40-hex-sha> cannot collide with any realistic filename.
  MAX_FILES_PER_COMMIT=50 skips merge / mass-refactor commits that would
  otherwise produce O(N^2) spurious pairs. Returns needsFullRescan when
  the previously-mined HEAD is unreachable (force-push, gc).

- **DB methods**: applyCoChangeDeltas (UPSERT pairs + UPDATE per-file
  counts in one transaction), clearCoChanges, getCoChangedFiles. Jaccard
  is computed inline in SQL with proper ranking + LIMIT applied AFTER
  filtering so high-Jaccard pairs aren't silently dropped.

- **Wiring**: indexAll triggers full mining; sync triggers incremental
  mining since last_mined_cochange_head (stored in project_metadata,
  same pattern as PR colbymchenry#100's sync HEAD tracking).

- **Public API**: cg.getCoChangedFiles(filePath, { limit, minCount,
  minJaccard }) returns ranked partners with count + Jaccard.

- **Config**: enableCoChange?: boolean on CodeGraphConfig, default true.
  False skips mining entirely. Field is properly merged in mergeConfig
  so the opt-out survives a config save/load round-trip.
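The Jaccard normalisation the DB computes in SQL reduces to the standard set formula over commit sets — a sketch, assuming J = co_count / (commits_a + commits_b - co_count):

```typescript
// Jaccard similarity of two files' commit sets, from the co-change count
// and each file's total commit count (|A∩B| / |A∪B|).
function jaccard(coCount: number, commitsA: number, commitsB: number): number {
  const union = commitsA + commitsB - coCount;
  return union > 0 ? coCount / union : 0;
}
```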

## Files changed

| File | Change |
|---|---|
| src/cochange/index.ts (NEW) | Miner: parses git log into pair counts |
| src/db/schema.sql | Add commit_count column + co_changes table + indexes |
| src/db/migrations.ts | Bump version to 4; add migration v4 (idempotent) |
| src/db/queries.ts | applyCoChangeDeltas / clearCoChanges / getCoChangedFiles |
| src/index.ts | mineCoChanges wired into indexAll + sync |
| src/types.ts | enableCoChange config field, default true |
| src/config.ts | mergeConfig now propagates enableCoChange |
| __tests__/cochange.test.ts (NEW) | 27 regression tests |
| __tests__/foundation.test.ts, __tests__/pr19-improvements.test.ts | Update expected schema version |

## Test plan

- [x] npm test: 405/407 pass (two pre-existing fs.watch flakes; watcher
      passes 12/12 in isolation)
- [x] npx tsc --noEmit clean
- [x] Independent reviewer pass before pushing — addressed three findings:
      - mergeConfig dropped enableCoChange (BLOCK → fixed; round-trip test added)
      - minJaccard was applied JS-side after a small SQL LIMIT, silently
        dropping high-Jaccard pairs ranked beyond the over-fetch
        (REQUEST_CHANGES → moved into SQL WHERE; regression test added)
      - parser sentinel `--` could collide with a file literally named `--`
        (REQUEST_CHANGES → switched to CGCMT-<sha>; regression test added)

## Notes

- Schema bump to v4 collides with PR colbymchenry#102 and PR colbymchenry#104 (both also v4). If
  any of those land first, this rebases to v5.
- HEAD-reset-behind-anchor (rewinding past last_mined without force-push)
  is a known limitation; merged-base ancestry check could detect it but
  isn't worth the complexity for this PR.
- Mining adds ~1-3s to indexAll on a typical repo. enableCoChange: false
  opts out entirely.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a new edge kind `tests` that links every test file to the source
file(s) it covers, derived from filename convention. Makes "what tests
this?" / "what does this test cover?" a one-call lookup that previously
required grepping the codebase.

Pure file→file relation between existing `kind='file'` nodes; no schema
change, no migration. Honest about what it can and can't resolve:
feature-themed tests with no obvious subject (`security.test.ts`,
`integration.test.ts`) get zero edges rather than a guess.

## Empirical results (codegraph's own __tests__)

Of 11 test files, 8 resolve cleanly to a single subject:

| Test | Subject |
|------|---------|
| __tests__/cochange.test.ts | src/cochange/index.ts |
| __tests__/context.test.ts | src/context/index.ts |
| __tests__/extraction.test.ts | src/extraction/index.ts |
| __tests__/graph.test.ts | src/graph/index.ts |
| __tests__/installer.test.ts | src/installer/index.ts |
| __tests__/resolution.test.ts | src/resolution/index.ts |
| __tests__/sync.test.ts | src/sync/index.ts |
| __tests__/watcher.test.ts | src/sync/watcher.ts (co-located) |

The 3 unresolved (foundation, pr19-improvements, security) are
correctly identified as multi-subject feature tests.

## Resolution strategy

`findTestSubjects(testFile, allFiles)` tries four steps in order, returning
the first matches found:

  1. Co-located: `path/foo.test.ts` → `path/foo.ts` or `path/foo/index.ts`
  2. Mirrored: `path/__tests__/bar.test.ts` → `path/bar.ts`
     (also strips top-level `tests/` and `spec/` prefixes)
  3. Common source roots: `__tests__/x.test.ts` → `src/x.ts`,
     `lib/x.ts`, `app/x.ts`, `packages/x.ts`
  4. Basename anywhere with prefix-tiebreaker

Returns [] if no match found; never guesses.
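Step 1 alone can be sketched like this (illustrative function; the real findTestSubjects chains all four steps and supports the full suffix table below):

```typescript
// Co-located resolution: path/foo.test.ts → path/foo.ts or path/foo/index.ts.
// Returns [] rather than guessing when neither candidate exists.
function coLocatedSubjects(testFile: string, allFiles: Set<string>): string[] {
  const m = testFile.match(/^(.*)\.(?:test|spec)\.(ts|tsx|js|jsx|mjs|cjs)$/);
  if (!m) return [];
  const [, base, ext] = m;
  const candidates = [`${base}.${ext}`, `${base}/index.${ext}`];
  return candidates.filter((c) => allFiles.has(c));
}
```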

## Conventions handled

  foo.test.{ts,tsx,js,jsx,mjs,cjs}        Jest, Vitest
  foo.spec.{ts,tsx,js,jsx,mjs,cjs}        Jest, Vitest (alt suffix)
  test_foo.{py,rs}                        pytest, Rust
  foo_test.{go,py,rs}                     Go convention
  foo_{spec,test}.rb                      RSpec, Minitest
  FooTest.{java,kt,cs,swift}              xUnit
  FooTests.{java,kt,cs,swift}             xUnit (plural)
  FooSpec.{swift,kt}                      Quick, Spek

## Wiring

  - indexAll: rebuildTestEdges() after orchestrator success.
    Deletes all `tests` edges + re-resolves.
  - sync: rebuildTestEdgesForFiles(changed) when git fast path
    available; falls back to full rebuild otherwise.
  - Subject deletion handled by FK CASCADE on file node.

## Public API

  cg.getTestsForFile(filePath) → FileRecord[]
    Tests that cover this source file (incoming `tests` edges).
  cg.getSubjectsOfTest(testFilePath) → FileRecord[]
    Subject files of a test (outgoing `tests` edges).

## Files changed

| File | Change |
|---|---|
| src/types.ts | Add 'tests' to EdgeKind |
| src/tests-edges/index.ts (NEW) | Resolver: testSubjectBasename, isTestFile, findTestSubjects |
| src/db/queries.ts | deleteEdgesBySourceAndKind, deleteAllEdgesByKind (cached statements) |
| src/index.ts | rebuildTestEdges + rebuildTestEdgesForFiles wired into indexAll/sync; getTestsForFile, getSubjectsOfTest public API |
| __tests__/tests-edges.test.ts (NEW) | 29 regression tests |

## Test plan

  - [x] npm test: 408/409 pass (one pre-existing fs.watch flake; passes 12/12 in isolation)
  - [x] npx tsc --noEmit clean
  - [x] Bench against codegraph confirms 8/11 resolution rate
  - [x] Independent reviewer pass before pushing — addressed three findings:
    - Top-level `tests/` directory was not stripped from mirrored path
      (the slash-prefixed regex required a leading `/`); fixed by adding
      a leading-anchored alternative. Regression test with decoy file.
    - Deleted test files were re-resolved against an `allFiles` set that
      no longer contained them; now filtered to still-tracked files.
    - Two new DELETE methods skipped statement caching contrary to the
      established pattern; now cached in `this.stmts`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…hods

When a query matches many symbols in a single file, current ranking
returns the matching class plus 9 of its members from the same file.
The first hit is informative; the next 9 are implementation detail
that pushes peer files (subclasses, callers, sibling modules) past the
limit. This PR caps results per file so search surfaces representative
breadth across the codebase rather than burying the user in one
class's internals.

## Empirical lift on codegraph (limit=10, default cap=3)

| Query | Before (max from one file) | After |
|---|---:|---:|
| ExtractionOrchestrator | 10/10 | 9/10 (only one file matches; backfill kicks in) |
| database | 8/10 | 3/10 |
| config | 5/10 | 3/10 |
| resolve | 4/10 | 3/10 |
| extract / parse | 3 (no regression) | 3 |

Top-1 result is preserved in every case — diversification only
reorders second-and-onward.

## Components

- `SearchOptions.perFileCap?: number` — default 3; 0 disables.

- `diversifyByFile(results, limit, perFileCap)` in
  src/search/query-utils.ts: pure function. First pass picks at most
  perFileCap per file in score order. If limit isn't yet filled,
  backfills from skipped (in original score order) so we never return
  fewer results than the caller requested.

- searchNodes wires it after the existing rescoring pass, when there
  are more candidates than the caller's limit. Relies on the existing
  5x internal over-fetch in searchNodesFTS for headroom — no new
  multiplier added (multiplier-on-multiplier composition was the
  reviewer's blocking concern in an earlier draft).
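
The first-pass/backfill logic can be sketched as follows (hypothetical minimal result shape; the real function operates on full search records):

```ts
interface Result { file: string; score: number }

// Cap results per file, then backfill from the skipped pool so the
// caller never gets fewer than `limit` results when enough candidates
// exist. Input is assumed already sorted by score, best first.
function diversifyByFile(results: Result[], limit: number, perFileCap = 3): Result[] {
  if (perFileCap <= 0) return results.slice(0, limit); // 0 disables

  const perFile = new Map<string, number>();
  const picked: Result[] = [];
  const skipped: Result[] = [];

  // First pass: at most perFileCap per file, in score order.
  for (const r of results) {
    const n = perFile.get(r.file) ?? 0;
    if (n < perFileCap && picked.length < limit) {
      perFile.set(r.file, n + 1);
      picked.push(r);
    } else {
      skipped.push(r);
    }
  }

  // Backfill: if the cap left the limit unfilled, restore skipped
  // results in their original score order.
  for (const r of skipped) {
    if (picked.length >= limit) break;
    picked.push(r);
  }
  return picked;
}
```

Because the top-scored result is always admitted first, top-1 is preserved by construction — diversification only reorders second-and-onward.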

## Files changed

| File | Change |
|---|---|
| src/types.ts | Add perFileCap to SearchOptions |
| src/search/query-utils.ts | Add diversifyByFile pure helper |
| src/db/queries.ts | Wire diversifyByFile into searchNodes; comment on the over-fetch composition |
| __tests__/diversify.test.ts (NEW) | 13 regression tests |

## Test plan

- [x] npm test: 393/393 pass on macOS
- [x] npx tsc --noEmit clean
- [x] Bench script confirms the lift in the table above
- [x] Independent reviewer pass before pushing — addressed:
  - Multiplier-on-multiplier (4x outer * 5x inner = 20x for large
    limits): outer multiplier removed; inner over-fetch is sufficient.
  - Within-limit reorder: documented as intentional pure-function
    behavior; integration path correctly skips when results <= limit.
  - MCP exposure of perFileCap: deferred — default 3 is the desired
    new behavior; MCP can pick it up later if users want to tune.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…r writes

Three DB-layer improvements bundled in one PR.

## 1. Batch getNodesByIds — fix N+1 in graph traversal

QueryBuilder.getNodesByIds(ids[]) returns Map<id, Node> in one
round-trip (chunked at 500 for SQLite param-limit safety, cache-aware
so already-cached entries are served from memory and only the misses
hit SQL).

Replaces 9 N+1 loops in src/graph/traversal.ts:
- getCallersRecursive / getCalleesRecursive
- getTypeAncestors / getTypeDescendants
- findUsages
- getImpactRecursive (both inner loops: contains-children + dependents)
- findPath BFS frontier
- traverseBFS / traverseDFS neighbor expansion
- getChildren

Each previously did `getNodeById` per edge inside a loop; for a function
with N callers at depth D, that was N^D point reads per traversal.
Now: one IN-list query per traversal step. Expected 10-50x speedup on
deep / fan-out-heavy traversals (impact analysis on popular utilities,
call graphs of central modules).
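
A chunked, cache-aware batch lookup in the spirit of getNodesByIds can be sketched as follows (assuming a better-sqlite3-style `prepare(...).all(...)` API; node shape simplified to just `id`):

```ts
// Chunk at 500 to stay under SQLite's default 999 bound-parameter limit.
const CHUNK = 500;

function getNodesByIds(
  db: { prepare(sql: string): { all(...params: string[]): { id: string }[] } },
  cache: Map<string, { id: string }>,
  ids: string[]
): Map<string, { id: string }> {
  const out = new Map<string, { id: string }>();
  const misses: string[] = [];

  // Serve cached entries from memory; only the misses hit SQL.
  for (const id of ids) {
    const hit = cache.get(id);
    if (hit) out.set(id, hit);
    else misses.push(id);
  }

  // One IN-list query per chunk instead of one point read per id.
  for (let i = 0; i < misses.length; i += CHUNK) {
    const chunk = misses.slice(i, i + CHUNK);
    const placeholders = chunk.map(() => "?").join(", ");
    const rows = db
      .prepare(`SELECT * FROM nodes WHERE id IN (${placeholders})`)
      .all(...chunk);
    for (const row of rows) {
      cache.set(row.id, row);
      out.set(row.id, row);
    }
  }
  return out;
}
```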

## 2. insertNode cache invalidation — fix correctness bug

QueryBuilder.insertNode uses INSERT OR REPLACE INTO nodes. The LRU
nodeCache was invalidated by updateNode and deleteNode but NOT by
insertNode, so a re-indexed node (replacing a cached row) would still
serve the pre-replace version on next getNodeById until LRU eviction
pushed it out.

Now invalidates `nodeCache.delete(node.id)` at the top of insertNode
(after validation, before SQL — so failed-validation early-returns
don't churn the cache).

## 3. Auto-maintenance after bulk writes

DatabaseConnection.runMaintenance() runs:
  - PRAGMA optimize    (incremental ANALYZE; only re-analyzes tables
                        whose row counts changed materially since last
                        ANALYZE — without it, the SQLite query planner
                        has zero statistics on freshly-bulk-loaded
                        tables and can pick wrong indexes)
  - PRAGMA wal_checkpoint(PASSIVE)
                       (fold WAL pages back into the main DB file so
                        the WAL doesn't grow unboundedly between
                        automatic checkpoints which fire at 1000 pages)

Both are non-blocking, and failures are silently swallowed — best-effort,
never load-bearing for correctness. Wired into indexAll (when files
were indexed) and sync (when files changed); a no-op when nothing
happened.
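
A sketch of the two-pragma helper, assuming a better-sqlite3-style `pragma()` method; the fire-and-forget error handling is the point:

```ts
function runMaintenance(db: { pragma(src: string): unknown }): void {
  try {
    // Incremental ANALYZE: refresh planner stats only for tables whose
    // row counts changed materially since the last ANALYZE.
    db.pragma("optimize");
  } catch {
    /* best-effort; never load-bearing */
  }
  try {
    // Fold WAL pages back into the main DB file without blocking readers.
    db.pragma("wal_checkpoint(PASSIVE)");
  } catch {
    /* best-effort; never load-bearing */
  }
}
```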

## Files changed

| File | Change |
|---|---|
| src/db/queries.ts | Add getNodesByIds; invalidate cache in insertNode |
| src/db/index.ts | Add runMaintenance() helper |
| src/graph/traversal.ts | Replace 9 N+1 loops with batch lookups |
| src/index.ts | Call runMaintenance after indexAll/sync |
| __tests__/db-perf.test.ts (NEW) | 9 regression tests |

## Test plan

- [x] npm test: 388/389 pass on macOS (one pre-existing fs.watch flake)
- [x] npx tsc --noEmit clean
- [x] Independent reviewer pass before pushing — APPROVE; one info
      finding addressed (getChildren was the 9th N+1 site, now batched
      too)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…write

Three watcher tests wrote files immediately after `watcher.start()`,
hitting a race with macOS FSEvents (and Linux inotify): there's a small
but real latency between `fs.watch()` returning and the kernel actually
delivering events. Under parallel test load (when the host CPU is busy
running other test files in worker threads), that latency balloons and
the file-change event is dropped before the watcher is fully registered.
Result: the test waits 5s for a sync that never fires and times out.

Affected:
  - debounced sync > should trigger sync after file change
  - callbacks > should call onSyncComplete after successful sync
  - callbacks > should call onSyncError when sync throws
  - debounced sync > should debounce rapid changes (less affected; its
    50ms-spaced loop incidentally settles, but explicit is better)
  - CodeGraph integration > should auto-sync when files change

Other tests in the same file already had a 400ms settle delay with a
comment ("Let watcher settle — fs.watch may fire residual events from
beforeEach") in the filtering tests. This PR factors that out into a
`letWatcherSettle()` helper and applies it consistently to every test
that writes immediately after `start()`.
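
The helper itself is just a delay, but naming it documents the race it papers over (a sketch; the actual test file may shape it slightly differently):

```ts
// Give fs.watch time to register with the kernel before the test
// writes files. 400ms mirrors the delay the filtering tests already
// used; hypothetical default, tune per platform if needed.
function letWatcherSettle(ms = 400): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Usage in a test body (hypothetical watcher API):
//   watcher.start();
//   await letWatcherSettle();
//   fs.writeFileSync(tmpFile, "changed");
```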

No production code change. The flake had no user impact — it was purely
a test-order artifact under parallel load.

## Verification

  - Pre-fix: ran `npm test` 8+ times across this session — fs.watch flake
    fired in roughly 1 of 3 runs under parallel load.
  - Post-fix: 3 consecutive `npm test` runs, 380/380 each, no flakes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…om a diff

Adds a new MCP tool that takes a unified diff and returns structured
review context for an LLM-driven PR reviewer. Codegraph becomes a
substrate for Greptile/CodeRabbit-style products without itself doing
the synthesis.

## What it returns

Per changed file:
  - status (added / modified / deleted / renamed)
  - hunks (line ranges)
  - affected symbols (line-range overlap with hunks)
  - tests covering the file (via PR colbymchenry#106 `tests` edges; graceful no-op
    if those edges aren't present)

Per affected symbol:
  - signature, docstring
  - top-N callers (PR colbymchenry#108's batch lookups make this fast)
  - top-N callees
  - impact-radius node count

For deleted files:
  - brokenIncomingRefs — every distinct caller of a vanishing symbol
    (deduped by source name + file + line)

Co-change warnings (the genuinely novel signal):
  - For each changed file, surface up to N historical co-changers
    that were NOT touched in this PR (catches "you changed schema.sql
    but didn't update migrations.ts" coupling violations)
  - Graceful degrade if PR colbymchenry#105's co_changes table isn't present

Output is JSON; the LLM consumer does the synthesis (writes review
comments, decides severity, posts to GitHub).

## Components

- src/review/diff-parser.ts — pure unified-diff parser. Handles
  modified, added (/dev/null in ---), deleted (/dev/null in +++),
  renamed (rename from/to), multi-hunk files, single-line hunks
  (no comma in @@), C-style-quoted paths with spaces, and hunk-less
  rename / mode-change files mid-diff.

- src/review/index.ts — buildReviewContext(diff, queries, traverser).
  Iterates files, maps hunks to symbols, attaches per-symbol context,
  surfaces co-change warnings.

- src/index.ts — CodeGraph.buildReviewContext(diff, options) public API.

- src/mcp/tools.ts — codegraph_review_context tool definition + handler.
  JSON output runs through serializeReviewContextWithinCap which
  progressively trims (docstrings → signatures → caller/callee caps →
  drop callers/callees → drop oldest files → summary-only) so the
  result stays valid JSON even when it would exceed MAX_OUTPUT_LENGTH.
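
The hunk-header handling the parser needs — including the no-comma single-line form — can be sketched as:

```ts
interface Hunk { oldStart: number; oldLines: number; newStart: number; newLines: number }

// "@@ -1,4 +1,6 @@" is the common shape; "@@ -5 +7 @@" omits the
// comma, in which case the line count defaults to 1.
function parseHunkHeader(line: string): Hunk | null {
  const m = /^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@/.exec(line);
  if (!m) return null;
  return {
    oldStart: Number(m[1]),
    oldLines: m[2] === undefined ? 1 : Number(m[2]),
    newStart: Number(m[3]),
    newLines: m[4] === undefined ? 1 : Number(m[4]),
  };
}
```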

## Files changed

| File | Change |
|---|---|
| src/review/diff-parser.ts (NEW) | Pure unified-diff parser |
| src/review/index.ts (NEW) | buildReviewContext + co-change warnings |
| src/index.ts | Add CodeGraph.buildReviewContext public API |
| src/mcp/tools.ts | New tool + handler + JSON-safe truncation |
| __tests__/review-context.test.ts (NEW) | 25 regression tests |

## Test plan

- [x] npm test: 405/405 pass on macOS
- [x] npx tsc --noEmit clean
- [x] Independent reviewer pass before pushing — addressed four findings:
  - Mid-diff hunk-less rename / mode-change was silently dropped
    (request_changes → fixed; flushHunkless on every header transition,
    flags reset after consumption to prevent phantom emits)
  - truncateOutput sliced JSON mid-string producing invalid JSON
    (request_changes → tier-and-trim helper, regression test asserts
    output stays parseable)
  - brokenIncomingRefs could double-list the same caller with multiple
    edge types (info → dedup by name|filePath|line, regression test added)
  - C-style-quoted paths in `diff --git "a/x" "b/y"` headers were not
    stripped (info → regex now matches both shapes, regression test added)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds opt-out one-line LLM-generated summaries for indexed symbols when
a local OpenAI-compatible endpoint (default Ollama on
http://localhost:11434/v1) is reachable. No config required: detection
runs lazily on first index/sync and is cached per CodeGraph instance.

Architecture:
- HTTP-only LLM client; the model runs in a separate process so the
  WASM/ONNX failure mode that motivated PR colbymchenry#87 cannot recur.
- Schema v4 adds symbol_summaries(node_id, content_hash, summary,
  model, generated_at) keyed on node_id with content_hash invalidation.
- summarizeAll runs as a background fire-and-forget after indexAll/sync
  with AbortController-based cancellation on close().
- Dirty flag re-queues a follow-up pass when sync fires mid-pass so
  newly indexed symbols don't have to wait for the next sync.
- Path-traversal guard reuses validatePathWithinRoot for body reads.

Surfaces:
- CodeGraph.startBackgroundSummarization, awaitBackgroundSummarization,
  getEffectiveLlmConfig, getSymbolSummaries, getSummaryCoverage.
- CLI index/sync await background pass interactively (Ctrl-C to skip);
  --quiet (git hooks) explicitly disables summarization to keep hooks
  fast. status command surfaces detected endpoint + coverage %.
- MCP formatSearchResults / formatNodeDetails enrich output with the
  cached summary line, threaded through the per-request CodeGraph so
  cross-project lookups stay correct.

Tests: 13 new in __tests__/llm.test.ts covering reachability, model
preference, content_hash cache, cancellation on close, dirty-flag
re-queue, no-LLM-reachable path. Full suite 392/392.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tier 1 #1, #2, and Tier 2 #6 of the LLM enrichment roadmap. Builds
on PR colbymchenry#111's symbol summaries:

Phase 0 — embeddings infrastructure:
  - Schema v5 adds embedding BLOB + embedding_model on symbol_summaries
  - src/llm/embeddings.ts: embedAllSummaries pipeline, vectorToBytes /
    bytesToVector helpers, cosineNormalised for L2-normalised vectors,
    topKByCosine, reciprocalRankFusion (k=60 canonical)
  - QueryBuilder: getEmbeddableSummaries, getAllEmbeddings,
    upsertSymbolEmbedding
  - Background pass embeds summaries when an embedding model is
    configured/auto-detected (nomic-embed-text by default), runs
    sequentially after summarization

Tier 1 #1 — semantic search:
  - CodeGraph.searchHybrid blends FTS5 lexical with cosine semantic
    via Reciprocal Rank Fusion. Falls back to FTS-only when
    embeddings aren't available so callers can use it universally.
  - MCP codegraph_search gains a `mode: 'hybrid'|'fts'` arg, defaults
    to hybrid
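
The fusion step can be sketched over ranked id lists (a minimal sketch; the real searchHybrid fuses FTS and cosine result lists of full node records):

```ts
// Reciprocal Rank Fusion with the canonical k=60: each list contributes
// 1/(k + rank) per item. An item missing from a list contributes
// nothing, which is why the FTS-only fallback degrades gracefully —
// fusing a single list is just that list.
function reciprocalRankFusion(lists: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```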

Tier 1 #2 — codegraph_ask (RAG Q&A):
  - src/llm/ask.ts: askWithCandidates pipeline. Hybrid retrieve →
    pick top MAX_FULL_BODIES for verbatim inclusion, list rest by
    name+summary → strict prompt that demands citations and refuses
    to invent missing context
  - CodeGraph.ask wraps it with reachability check
  - New MCP tool: codegraph_ask
  - New CLI: `codegraph ask "question"` with sources block

Tier 2 #6 — cross-language matching:
  - CodeGraph.findSimilar(nodeId) over the embedding space with
    sameLanguage / differentLanguage filters
  - New MCP tool: codegraph_similar
  - Useful for polyglot repos: find the JS impl that mirrors the Go
    one, etc.

Privacy / opt-out unchanged from PR colbymchenry#111: HTTP-only client, no
in-process inference, no native deps, no external endpoints probed.

Tests: 9 new in __tests__/embeddings.test.ts cover BLOB round-trip,
cosine, RRF, fallback path, end-to-end summarise+embed pipeline,
hybrid blending, findSimilar with language filters. Schema v5 test
expectations bumped. 401/402 passing (1 pre-existing fs.watch flake).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Builds out the rest of the LLM enrichment roadmap on top of the
semantic-search/RAG foundation. All features remain fully optional;
codegraph behaves identically when no LLM is reachable.

Tier 1 #3 — directory/module summaries:
  - Schema v6 adds directory_summaries(dir_path PK, summary,
    content_hash, model, generated_at)
  - src/llm/dir-summarizer.ts aggregates symbol summaries per dir into
    one paragraph (≤600 chars), keyed by stable content_hash; min 3
    summarized symbols per dir as the threshold for "worth a paragraph"
  - Wired into background pass after summaries+embeddings
  - New MCP tool codegraph_module (per-dir lookup, or list-all when
    dirPath omitted)

Tier 2 #4 — change-intent helper:
  - src/llm/change-intent.ts: standalone summarizeChange(client, name,
    kind, beforeBody, afterBody). Three modes (added | removed |
    modified) with shape-appropriate prompts.
  - CodeGraph.summarizeChange wrapper; no caching at this layer
    (PR-review tooling owns lifecycle), no MCP/CLI surface yet.

Tier 2 #5 — role classification:
  - Schema v7 adds role + role_model columns on symbol_summaries plus
    idx_summaries_role
  - Closed label set: api_endpoint | business_logic | data_model |
    util | framework_glue | test_helper | unknown
  - parseRole tolerates punctuation, fences, and multi-word
    "Business Logic" variants by snake_casing as a fallback
  - Wired into background pass after dir summaries
  - New MCP tool codegraph_role (filter by label); CodeGraph methods
    findNodesByRole, getRoleCounts
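
The tolerant parsing described above can be sketched as follows (a sketch against the stated label set; the real parseRole is exported from the classifier):

```ts
const ROLES = [
  "api_endpoint", "business_logic", "data_model",
  "util", "framework_glue", "test_helper",
] as const;
type Role = (typeof ROLES)[number] | "unknown";

// Strip fences and trailing punctuation, try the closed set directly,
// then snake_case multi-word answers like "Business Logic" as a
// fallback before degrading to "unknown".
function parseRole(raw: string): Role {
  const cleaned = raw
    .replace(/```/g, "")
    .trim()
    .replace(/[.!,;:]+$/, "")
    .toLowerCase();
  if ((ROLES as readonly string[]).includes(cleaned)) return cleaned as Role;
  const snake = cleaned.replace(/[\s-]+/g, "_");
  if ((ROLES as readonly string[]).includes(snake)) return snake as Role;
  return "unknown";
}
```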

Tier 3 #7 — dead-code judge:
  - SQL pre-filter findOrphanedSymbols: kind in {function|method|
    class|component} + is_exported=0 + zero incoming `calls` edges,
    with path-pattern exemptions for tests/scripts/bench
  - LLM judge returns JSON {verdict: dead|live|uncertain, confidence,
    reason}; tolerant fence-stripper (incl. stray backticks) and
    confidence clamp [0..1]; results sorted dead-first then by
    confidence
  - New MCP tool codegraph_dead_code with verdict filter

Tier 3 #8 — naming-drift checker:
  - sampleSiblingNames pulls ≤30 same-kind names from other files
  - LLM judge returns {consistent, suggestion, reason}; defaults to
    "consistent" on parse failure or <5 siblings (advisory, not
    enforcement)
  - CodeGraph.checkNamingDrift wrapper; designed for sync-hook
    callers, no MCP surface yet

Agent-as-LLM bridge (NEW — works without any local LLM):
  - src/llm/agent-bridge.ts exposes a contract for the agent in the
    user's session (Claude Code, etc.) to fill the summary cache
    directly when no Ollama is available.
  - codegraph_pending_summaries returns un-summarized symbols with
    bodies + content_hash for the agent to summarize.
  - codegraph_save_summaries persists agent results, re-validating
    content_hash against current disk to reject stale answers.
  - Cache shape is identical to the local-LLM path; the two coexist
    cleanly.
  - Reuses validatePathWithinRoot for body reads; same
    SUMMARIZABLE_KINDS / MIN_BODY_LINES filter as local pass for
    cache-key parity.

Reviewer-flagged fixes (from independent review of the tier work):
  - dir-summarizer cache-hit branch no longer double-increments `done`
    (the `finally` always runs on `continue`)
  - classifier parseRole gains snake_case fallback so "Business Logic"
    and similar multi-word responses no longer silently degrade to
    "unknown"
  - dead-code parseJudgeResponse strips stray backticks before
    JSON.parse so fenced multi-line responses don't fall back to
    uncertain
  - Post-chat abort guards added to summarizer, dir-summarizer,
    classifier, dead-code: writes after a close()-cancelled chat are
    now skipped instead of persisted as stale entries
  - dir-summarizer module doc acknowledges prompt-injection accepted
    risk (data originates from user's own codebase)
  - dead-code test replaces tautological assertion with meaningful
    judged ≤ candidates / errors == 0 checks
  - parseRole exported for direct testing; new test covers canonical,
    fenced, multi-word, and trailing-punct inputs

Tests: 7 new in __tests__/llm-tiers.test.ts
  - Background pass writes both directory summaries and role labels
  - summarizeChange added/removed/modified
  - Dead-code judge returns parsed verdicts in valid range, no errors
  - parseRole handles all four input shapes
  - Agent bridge round-trip without LLM (pending → save → coverage
    reflects writes; second batch short-circuits on cache)
  - Agent bridge stale-content-hash rejection

Schema migrations cover v4 → v5 → v6 → v7 cleanly. Test schema
version expectations bumped. Full suite 409/409 passing including
the watcher tests that were previously flaky.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
upsertDirectorySummary bound parameters in (dir_path, contentHash,
summary, model, ...) order while the INSERT columns were declared in
(dir_path, summary, content_hash, model, ...) order, so paragraphs
landed in content_hash and 32-char hex landed in summary. Caught
running the end-to-end demo when codegraph_module returned hex blobs
instead of paragraphs.

Adds a regression test that asserts no summary matches /^[0-9a-f]{32}$/.
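
One way to make this bug class unrepresentable is named parameters, where bind order is irrelevant. A sketch of the pattern (not the actual fix, which reordered the positional binds), assuming a better-sqlite3-style `@name` placeholder syntax:

```ts
// Named placeholders bind by key, so reordering the column list can
// never silently swap summary and content_hash again.
const UPSERT_DIR_SUMMARY = `
  INSERT OR REPLACE INTO directory_summaries
    (dir_path, summary, content_hash, model, generated_at)
  VALUES (@dirPath, @summary, @contentHash, @model, @generatedAt)
`;
// stmt.run({ dirPath, summary, contentHash, model, generatedAt })
```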

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adding a new language used to require coordinated edits to 6
shared lists across 4 files (Language union in types.ts;
DEFAULT_CONFIG.include; WASM_GRAMMAR_FILES, EXTENSION_MAP, and
getLanguageDisplayName in grammars.ts; EXTRACTORS map in
languages/index.ts). Two PRs adding different languages typically
conflicted on every one of those.

After this refactor, adding a new language is:

  1. Drop a file at src/extraction/languages/<name>.ts exporting an
     <NAME>_DEF: LanguageDef constant.
  2. Add ONE import line and ONE array entry to
     src/extraction/languages/registry.ts (alphabetical position —
     conflicts from adjacent additions are still possible but rare).

That's it. grammars.ts, types.ts, tree-sitter.ts dispatch, and the
default include globs are all derived from the registry.

## What's in a LanguageDef

```ts
interface LanguageDef {
  name: string;                  // canonical id
  displayName: string;           // "Pascal / Delphi"
  extensions: readonly string[]; // ['.pas', '.dpr', ...]
  includeGlobs: readonly string[];
  // tree-sitter-backed languages
  grammar?: { wasmFile: string; vendored?: boolean; extractor: unknown };
  // custom-extractor languages (Liquid, Svelte)
  customExtractor?: (fp: string, src: string) => ExtractionResult;
  // per-extension overrides, e.g. { '.dfm': { customExtractor } } for Pascal forms
  extensionOverrides?: Record<string, Pick<LanguageDef, 'customExtractor'>>;
}
```

Each existing language file now exports both its `xxxExtractor`
(unchanged) AND a new `XXX_DEF`. New files were added for tsx, jsx,
svelte, liquid (the latter two wrap their existing custom extractor
classes via the customExtractor field).

## Refactored consumers

- src/extraction/grammars.ts: WASM_GRAMMAR_FILES removed (was
  internal-only); EXTENSION_MAP now a Proxy that lazy-builds from
  the registry on first access (avoids TDZ in cyclic load paths).
  loadGrammarsForLanguages, isLanguageSupported, isGrammarLoaded,
  getSupportedLanguages, getLanguageDisplayName, detectLanguage —
  all read from registry.
- src/extraction/tree-sitter.ts: extractFromSource's if-chain
  (svelte / liquid / pascal+.dfm/.fmx) replaced with one lookup:
  def.extensionOverrides[ext]?.customExtractor || def.customExtractor.
  Drops direct imports of LiquidExtractor, SvelteExtractor,
  DfmExtractor.
- src/types.ts: DEFAULT_CONFIG moved to src/default-config.ts (cycle
  break). types.ts re-exports for backward compat. The `include`
  array is now built lazily from each LanguageDef's includeGlobs.
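
The lazy-Proxy derivation pattern can be sketched as follows (hypothetical minimal shapes; the real EXTENSION_MAP derives from the full registry):

```ts
interface Def { name: string; extensions: readonly string[] }

// The backing map is built from the registry on first property access,
// not at module-evaluation time — which is what avoids the TDZ hazard
// when the registry and its consumers load in a cycle.
function makeExtensionMap(getRegistry: () => Def[]): Record<string, string> {
  let built: Record<string, string> | null = null;
  const build = () => {
    if (!built) {
      built = {};
      for (const def of getRegistry()) {
        for (const ext of def.extensions) built[ext] = def.name;
      }
    }
    return built;
  };
  return new Proxy({} as Record<string, string>, {
    get: (_t, prop) => build()[prop as string],
    has: (_t, prop) => prop in build(),
    ownKeys: () => Reflect.ownKeys(build()),
    getOwnPropertyDescriptor: (_t, prop) =>
      Object.getOwnPropertyDescriptor(build(), prop),
  });
}
```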

## What still requires a one-line edit

The Language string union in types.ts still hard-codes the known
languages (typescript | javascript | … | unknown). New languages
added to the registry work at runtime as strings, but adding the
literal here is required IF the resolver wants to do exhaustive
narrowing on the new language (resolution/index.ts and
resolution/import-resolver.ts have a few `language === 'X'`
branches). Most new languages don't need such branches.

This trade-off keeps strict narrowing for the existing handful of
language-specific code paths while making everything else
registry-driven.

## Tests

380/380 pass. No new tests; behavior is identical. Existing
extraction.test.ts and pr19-improvements.test.ts heavily exercise
detectLanguage, isLanguageSupported, getSupportedLanguages, and
loadAllGrammars — all green.

## Follow-ups (out of scope)

- Auto-discovery in registry.ts via fs.readdirSync — works in
  built dist/ but vite-node doesn't support extensionless require()
  of TS source. A small build-time generator could remove the
  static import list entirely.
- Splitting __tests__/extraction.test.ts into per-language test
  files — eliminates the test-end-of-file conflict surface that
  every language PR currently hits.
- Similar registry refactors for:
  - MCP tool definitions (each tool self-registers; no shared
    tools[] array or case-switch in execute())
  - Migration files (each migration in src/db/migrations/NNN-*.ts;
    auto-discovered by version)
  - Index/sync hooks (centrality, churn, issue-history,
    config-refs, sql-refs, cochange all currently mutate
    CodeGraph.indexAll/sync; an IndexHook interface would make
    each pass self-contained)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tractor

Reviewer caught a real bug: the original commit kept the
EXTRACTORS map in src/extraction/languages/index.ts as a separate
hand-curated registry that TreeSitterExtractor read from. Adding
a new grammar-backed language would have required editing
EXTRACTORS too, undermining the refactor's stated single-source-of-
truth claim. A future contributor missing the EXTRACTORS update
would silently produce empty extraction results.

Fix:
- TreeSitterExtractor now reads its extractor straight off the
  language def: getLanguageDefByName(this.language)?.grammar?.extractor
- EXTRACTORS in languages/index.ts becomes a Proxy that derives
  lazily from the registry (kept for backward compat — readers
  unchanged).
- Add 16 structural-invariant tests in __tests__/language-registry.test.ts
  that fail loudly if any derived consumer drifts from the registry:
  EXTRACTORS / EXTENSION_MAP / detectLanguage / isLanguageSupported /
  getSupportedLanguages / getLanguageDisplayName all asserted to
  exactly mirror the registry contents.

Adding a new grammar-backed language is now genuinely "one new file
+ two lines in registry.ts" — no other files to touch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…flicts

Today every PR adding an MCP tool conflicts on the same two
shared lists in src/mcp/tools.ts: the tools[] array (the
list_tools surface) and the case switch in execute(). After this
refactor:

  Adding a new MCP tool:
  1. Drop a file at src/mcp/tools/<name>.ts exporting a
     <NAME>_TOOL: ToolModule (definition + handlerKey).
  2. Add one import line and one array entry to
     src/mcp/tools/registry.ts.
  3. Implement handle<Name>(args) on ToolHandler in tools.ts and
     add the new key to HandlerKey in tools/types.ts.

Step 3 is the only remaining "shared method on a single class"
conflict surface. Extracting handler bodies into per-tool files
(making step 3 also a single-file addition) is left as a
follow-up — the cost/benefit favors landing this incremental win
now and finishing the body extraction once language and migration
refactors land.

## What's new

- **src/mcp/tool-types.ts** — extracted ToolDefinition, ToolResult,
  PropertySchema, projectPathProperty into a shared module so
  per-tool files can import without circular dependency.
- **src/mcp/tools/types.ts** — ToolModule interface, HandlerKey
  string union, and ToolHandlerLike (a structural type that
  ToolHandler now `implements`, providing compile-time guarantee
  that every HandlerKey maps to a real method).
- **src/mcp/tools/<name>.ts × 9** — one file per existing tool
  (callees, callers, context, explore, files, impact, node, search,
  status). Each ~25-30 lines: import + definition literal +
  handlerKey reference.
- **src/mcp/tools/registry.ts** — static-import barrel, sorted
  alphabetically. Exports getToolModules(), getToolModule(name),
  and the derived `tools[]` array.
- **src/mcp/tools.ts** — ~200 lines deleted from the top
  (inline types + tools[] array + projectPathProperty).
  execute()'s case-switch replaced with a registry lookup +
  type-safe `this[mod.handlerKey](args)` dispatch (now compile-
  time-checked thanks to `implements ToolHandlerLike`).
  All `private async handle*` methods now public to match the
  interface. errorResult/textResult also public for the same reason.
- **src/mcp/index.ts** — MCPServer's tool-existence check switched
  from a linear `tools.find()` scan to the O(1) `getToolModule()`
  Map lookup, eliminating two parallel lookup paths.

## Tests

387/387 pass. **7 new tests** in __tests__/mcp-tool-registry.test.ts:
- Definitions are well-formed (name shape, description length).
- handlerKey shape (`handle<UpperCase>`).
- Every registered handlerKey resolves to a real method on
  ToolHandler.
- Exported `tools[]` exactly mirrors the registry.
- Canonical 9 main-line tools regression guard.
- execute() unknown-tool error path.
- **End-to-end dispatch smoke test**: execute('codegraph_status', {})
  reaches the real handler body (no broken `this` binding) — would
  fail loudly if the dynamic dispatch chain ever breaks.

## Reviewer pass

Independent reviewer ran once. 2 REQUEST_CHANGES + 2 INFO addressed:

1. ToolHandlerLike was defined but never enforced —
   ToolHandler now `implements ToolHandlerLike`. Eliminates the
   `(this as unknown as Record<...>)` cast in execute(); dispatch
   is fully compile-time-checked.
2. No end-to-end dispatch test — added one (see Tests above).
3. MCPServer.handleToolsCall used a linear `tools.find()` scan
   while execute() used Map lookup — switched to getToolModule()
   for parity.
4. Removed redundant .slice() in registry.ts (map() already
   returns a fresh array).

## Backward compat

src/mcp/tools.ts still re-exports ToolDefinition, ToolResult, the
mutable `tools[]` array, ToolHandler, and getExploreBudget. Every
existing consumer (`import { ToolDefinition, ToolResult, tools,
ToolHandler } from './tools'`) keeps working unchanged.

## Affected open PRs

- colbymchenry#110 (review-context): rebases to 1 new file in tools/ + 2
  lines in registry.ts + 1 method on ToolHandler + 1 line in
  HandlerKey.
- colbymchenry#112 (centrality+churn): same shape for the codegraph_hotspots
  tool.
- colbymchenry#114 (config-refs): same shape for codegraph_config.
- colbymchenry#115 (sql-refs): same shape for codegraph_sql.

Each goes from 4-way conflict (tools[] + case + handler + helpers)
down to 1-way conflict (HandlerKey + handler method on ToolHandler,
both in tools.ts).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Today every PR adding a schema migration claims
`CURRENT_SCHEMA_VERSION = next` AND adds an array entry to
`migrations: Migration[]` in src/db/migrations.ts. Two PRs both
claiming the same version resolve as: "second PR's v4 silently
no-ops on existing DBs" — a real silent-data-loss bug class
(PR colbymchenry#113's reviewer caught one).

After this refactor:

  Adding a new schema migration:
  1. Pick the next free 3-digit prefix (`git ls-files
     'src/db/migrations/[0-9]*.ts'` shows what's taken).
  2. Create `src/db/migrations/<NNN>-<short-name>.ts` exporting a
     `MIGRATION: MigrationModule` (description + up).
  3. Add one import line and one entry to
     `src/db/migrations/index.ts`'s REGISTERED_MODULES array.

Two PRs both creating `004-foo.ts` collide on the FILESYSTEM —
the maintainer sees it instantly. No more silent skipped
migrations.

## What's new

- `src/db/migrations/types.ts` — `MigrationModule { description,
  up }` and `Migration extends MigrationModule { version }`.
- `src/db/migrations/002-project-metadata.ts` — extracted v2
  body verbatim.
- `src/db/migrations/003-lower-name-index.ts` — extracted v3
  body verbatim.
- `src/db/migrations/index.ts` — central registry. Static-imports
  each migration, parses the version FROM THE FILENAME (no
  hand-typed version field that can drift), enforces strict
  `NNN-kebab-name.ts` shape, validates uniqueness/sort at module
  load (throws loudly on collision), exposes ALL_MIGRATIONS and
  CURRENT_SCHEMA_VERSION.
- `src/db/migrations.ts` — refactored to a thin runner. Same
  exported surface (CURRENT_SCHEMA_VERSION, getCurrentVersion,
  runMigrations, needsMigration, getPendingMigrations,
  getMigrationHistory, Migration type) — every existing import
  keeps working unchanged.
- `__tests__/migrations-registry.test.ts` — 8 invariant tests:
  registry non-empty, versions unique + strictly ascending,
  CURRENT_SCHEMA_VERSION matches max, every file matches the
  strict NNN-kebab-name pattern, no orphan files, no phantom
  registrations.

## Reviewer pass

Independent reviewer ran once. 3 REQUEST_CHANGES + 1 INFO addressed:

1. Hand-typed `version` field in REGISTERED_MODULES could drift
   from filename. **Fixed**: removed the version field; registry
   now parses version from filename via FILENAME_PATTERN regex
   inside validateRegistered.
2. Filename-pattern test was lenient (allowed 4-digit or 1-digit
   prefixes). **Fixed**: new "every migration file matches the
   strict NNN-kebab-name.ts pattern" test catches malformed
   filenames that would otherwise bypass orphan detection.
3. `getPendingMigrations` returned `readonly Migration[]`,
   breaking callers that typed the result as `Migration[]`.
   **Fixed**: returns a fresh mutable array via `.slice()`.
4. No throw-on-duplicate test for validateRegistered (module
   evaluation timing). Acknowledged; not added.

## Backward compat

Every existing import works unchanged:
- `import { CURRENT_SCHEMA_VERSION } from './migrations'` ✓
- `import { runMigrations } from './migrations'` ✓
- `import { needsMigration } from './migrations'` ✓
- `import { getMigrationHistory } from './migrations'` ✓
- `import { getPendingMigrations } from './migrations'` — returns
   mutable Migration[] (preserved)
- `Migration` type — re-exported

## Affected open PRs

Every migration-touching PR (colbymchenry#102 UNIQUE edges, colbymchenry#105 cochange,
colbymchenry#108 perf db, colbymchenry#111 LLM features, my colbymchenry#112 centrality+churn, colbymchenry#113
issue-history, colbymchenry#114 config-refs, colbymchenry#115 sql-refs) currently
claims migration v4 and conflicts with each other on
`migrations.ts`. After this lands they each become:
- 1 new file: `src/db/migrations/<NNN>-<name>.ts`
- 2 lines in registry.ts (import + array entry)

Conflict shape changes from "next free version + array entry +
CURRENT_SCHEMA_VERSION bump in one file" (4-way conflict) to "1
new file" + 2-line registry edit. If two PRs target the same
NNN, the filesystem collision surfaces immediately — no silent
skipped migrations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Today every PR adding a derived-signal pass (centrality, churn,
issue-history, config-refs, sql-refs, cochange) edits the same
4 spots in src/index.ts:

  1. New imports at the top
  2. New private method on `CodeGraph` (e.g. runDerivedSignals,
     runIssueHistoryPass, runConfigRefsPass, runSqlRefsPass)
  3. New call site in `indexAll` AFTER resolution
  4. New call site in `sync` AFTER resolution

5 PRs collide on every one of those.

After this refactor:

  Adding a new derived-signal pass:
  1. Create `src/index-hooks/<name>.ts` exporting a
     `HOOK: IndexHook` constant with `afterIndexAll` and/or
     `afterSync` methods.
  2. Add one import + one entry to
     `src/index-hooks/registry.ts`.

`CodeGraph.indexAll` and `sync` invoke the hook runner once;
adding a new pass touches only the hook file + the registry.
Zero changes to CodeGraph itself.
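A minimal sketch of that shape, assuming the interface described below (`afterIndexAll` / `afterSync`, both optional, receiving a context object); the `name` field and runner signature are illustrative, not the real src/index-hooks/ source:

```typescript
// Assumed context shape (config, queries, db omitted for brevity).
interface IndexHookContext {
  projectRoot: string;
}

interface IndexHook {
  name: string;
  afterIndexAll?: (ctx: IndexHookContext) => void;
  afterSync?: (ctx: IndexHookContext) => void;
}

// src/index-hooks/cochange.ts (hypothetical file under this pattern):
const HOOK: IndexHook = {
  name: "cochange",
  afterIndexAll: (ctx) => {
    console.log(`recomputing co-change pairs under ${ctx.projectRoot}`);
  },
};

// Runner sketch: iterate hooks and catch errors so one broken hook
// never fails indexing; returns the names of hooks that threw.
function runAfterIndexAll(
  hooks: IndexHook[],
  ctx: IndexHookContext,
): string[] {
  const failed: string[] = [];
  for (const hook of hooks) {
    try {
      hook.afterIndexAll?.(ctx);
    } catch {
      failed.push(hook.name);
    }
  }
  return failed;
}
```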

## What's new

- **src/index-hooks/types.ts** — `IndexHook` interface
  (`afterIndexAll`, `afterSync`, both optional), `IndexHookContext`
  (projectRoot + config + queries + db), and
  `IndexHookOutcome` for diagnostic reporting.
- **src/index-hooks/registry.ts** — static-import list of every
  registered hook (empty on main today; PRs adding hooks fill it
  in), plus the `runAfterIndexAll` / `runAfterSync` runners that
  iterate hooks and catch errors so one broken hook never fails
  indexing.
- **src/index.ts** — `indexAll` calls `runAfterIndexAll(ctx)`
  after resolution. `sync` calls `runAfterSync(ctx, result)`
  after resolution. New private `buildHookContext()` helper
  exposes a stable read-only context.
- **__tests__/index-hooks.test.ts** — 6 tests covering empty
  registry, runner shape, and the `afterIndexAll` / `afterSync`
  contracts.

## Why ship the framework on main with zero registered hooks?

The only consumers of this framework today are 5 unmerged PRs
(colbymchenry#105 cochange + my colbymchenry#112-colbymchenry#115). Landing the framework now lets
each of those PRs rebase to a 2-line change instead of 8-10
lines mutating adjacent code in CodeGraph. Without this, all 5
PRs collide on the same indexAll/sync call sites.

The framework adds zero behavior on main (no registered hooks =
no-op runner). 380→386 tests confirm no regression.

## Affected open PRs

| PR | Today | After this lands |
|---|---|---|
| colbymchenry#105 cochange | runDerivedSignals helper + 2 call sites | 1 hook file in src/index-hooks/ + 2 lines in registry.ts |
| colbymchenry#112 centrality+churn | same shape | same shape |
| colbymchenry#113 issue-history | same shape | same shape |
| colbymchenry#114 config-refs | same shape | same shape |
| colbymchenry#115 sql-refs | same shape | same shape |

Each goes from "edit CodeGraph in 4 spots" to "drop a hook file."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…efactors

Lands centrality (PageRank) and churn (git history) as registered
IndexHooks (`afterIndexAll` + `afterSync`) instead of CodeGraph
private methods. Adds:

- Migration 004: nodes.centrality + files.{commit_count,loc,
  first_seen_ts,last_touched_ts} + indexes
- src/centrality/ + src/churn/ (pure modules)
- src/index-hooks/centrality.ts + churn.ts (registered hooks)
- CodeGraph public methods: getCentrality, getTopCentralNodes,
  getCentralityRank, getFileChurn, getHotspots
- codegraph_hotspots MCP tool wired through ToolModule registry
  + handleHotspots on ToolHandler
- Updated regression-guard tests (index-hooks, mcp-tool-registry)
  to reflect newly registered hooks/tools

Tests: 440/440 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv and others added 30 commits April 27, 2026 18:10
# Conflicts:
#	src/extraction/index.ts
#	src/utils.ts
# Conflicts:
#	__tests__/foundation.test.ts
#	__tests__/pr19-improvements.test.ts
#	src/db/migrations.ts
#	src/db/schema.sql
#	src/utils.ts
Originally based on monolithic grammars.ts/tree-sitter.ts; rebased
to the per-language registry pattern (PR colbymchenry#116):
- src/extraction/languages/hcl.ts (LanguageDef with grammar+custom)
- 'hcl' added to Language union
- Vendored tree-sitter-hcl.wasm
- HclExtractor (custom block extractor)
- 220 extraction tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/extraction/languages/r.ts (rExtractor + R_DEF)
- 'r' added to Language union
- Vendored tree-sitter-r.wasm
- Existing extraction tests for HCL preserved + R tests added.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/extraction/languages/sql.ts (LanguageDef with grammar+custom)
- 'sql' added to Language union
- Vendored tree-sitter-sql.wasm
- SqlExtractor (custom DDL/DML extractor)
- Extensions: .sql, .ddl, .dml.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	__tests__/sync.test.ts
#	src/extraction/index.ts
#	src/types.ts
# Conflicts:
#	__tests__/foundation.test.ts
#	__tests__/pr19-improvements.test.ts
#	src/db/migrations.ts
#	src/db/queries.ts
#	src/db/schema.sql
#	src/index.ts
#	src/types.ts
# Conflicts:
#	src/index.ts
#	src/mcp/tools.ts
External PR (firehooper). Originally based on the monolithic
grammars.ts; rebased to per-language registry pattern.
- src/extraction/languages/scala.ts (scalaExtractor + SCALA_DEF)
- 'scala' added to Language union
- Vendored tree-sitter-scala.wasm
- Extensions: .scala, .sc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…istry

External PR (malo). Originally based on monolithic grammars.ts;
rebased to per-language registry pattern.
- src/extraction/languages/rescript.ts (rescriptExtractor + RESCRIPT_DEF)
- 'rescript' added to Language union
- Vendored tree-sitter-rescript.wasm
- Extensions: .res, .resi.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stress-test on the codegraph repo itself caught a false positive:
"the" appeared as a "table" because the SQL-verb pre-filter passed
on a docstring containing "drop docs/config files from the list" —
"drop" matched the DROP keyword check, then "from the" matched the
FROM <table> regex.

Fix: extend RESERVED_TABLE_NAMES with common English words (a, an,
the, of, to, in, is, it, for, this, that, these, those, with, by,
at) that can never legitimately be table names in production code.

Test fixture updated: `'SELECT * FROM a'` placeholder → realistic
`'SELECT * FROM users'`. Stress test passes on 3 real codebases
(codegraph 168 files, asf-platform 453 files, asf-dashboard 87
files) with the expanded reserved-word list.

Also adds scripts/stress-test.mjs — comprehensive harness exercising
every shipped feature beyond the happy path: index-hook framework,
FTS subwords + Porter, search diversification, centrality + churn
+ hotspots + cochange, issue-history, config-refs, sql-refs, tests-
edges, review-context, edges UNIQUE, HEAD-movement detection on a
synthetic repo, .codegraphignore on a synthetic repo. 18+ PASS
checks per codebase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three quick experiments before committing to build any of them:

A. dead-exports vs ts-prune: simple graph query produces 138 false
   positives (mostly re-exports codegraph doesn't track as direct
   incoming edges). 81% recall against ts-prune --loose. Verdict:
   the cheap graph-query approach is too noisy; need either proper
   import-resolution edges or TS-compiler integration to be useful.

B. weighted PageRank: weights {calls:3, instantiates:2, references:1}
   produce 8/10 same names with 0.4 avg position shift. Marginal
   signal. Verdict: skip the build.

C. coverage integration: tested on a repo with real coverage data.
   Coverage corrects 8 false positives: convention-based
   tests-edges flagged symbols as untested that the report
   shows are actually exercised.
   No truly-low-coverage central files that convention missed.
   Verdict: useful as a v2 quality-of-life upgrade, not urgent.

These spikes saved ~10h of build work that would have produced
noisy or marginal results. The script files remain in
scripts/spikes/ as reproducible probes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
D. Risk-delta: tested on the last 20 real commits in this repo.
   Significant rank shifts (≥5 ranks) on 12/20 commits. New
   top-10 entries on multiple feature merges. Metric is
   informative across most non-trivial commits.
   Verdict: BUILD.

E. simulate-change: built a synthetic broken diff (rename function
   definition without renaming its 3 callers — classic refactor
   mistake). Ran codegraph_review_context against it. All 3
   callers surfaced via symbol-level analysis even though the
   diff parser couldn't extract the file path from the synthetic
   format. The existing tool already catches the breakage case
   simulate-change was meant to handle.
   Verdict: SKIP — review-context is sufficient.

Combined with prior spikes (A=skip, B=skip, C=v2), the build
list shrinks to: explain_risk + HTML report + risk-delta. ~10
hours total, down from the original 11-item plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Spike A showed the underlying graph query was 45% precision (138 false
positives in 250 claims, mostly re-exports the graph doesn't track as
direct incoming edges). False positives at that rate erode trust in
every other audit finding, so the principled move is to drop the
section entirely.

Replace with a per-language tool table:
  TS/JS  → ts-prune, knip
  Python → vulture
  Go     → staticcheck, deadcode
  Rust   → cargo udeps, built-in dead_code lint
  Java   → pmd unusedcode rulesets
  C#     → Roslyn IDE0051

Includes a short paragraph explaining why codegraph defers — the graph
tracks calls and references but not the import-to-export linkage. This
is more useful to users than a finding at 45% precision.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ge) rebased

Rebased onto post-refactor patterns. PR colbymchenry#111 adds LLM-mediated tier-3
enrichment: symbol summaries, embeddings + semantic search, directory
summaries, role classification, dead-code judging, change-intent,
naming check, and an agent-bridge so agents (no local LLM needed)
can do the LLM work themselves.

Migrations 011-014 (file-based pattern):
  011-symbol-summaries.ts      symbol_summaries table
  012-summary-embeddings.ts    + embedding BLOB + embedding_model
  013-directory-summaries.ts   directory_summaries table
  014-summary-roles.ts         + role + role_model

7 new MCP tools (ToolModule pattern):
  codegraph_pending_summaries  agent-bridge: pull batch
  codegraph_save_summaries     agent-bridge: persist with hash check
  codegraph_dead_code          graph-signal + LLM judge (the spike-A response)
  codegraph_role               filter by classified role
  codegraph_module             directory-level synthesis paragraph
  codegraph_ask                hybrid-retrieve + chat-model RAG
  codegraph_similar            semantic-similar via embeddings

LLM infrastructure (src/llm/, 11 files): client, summarizer,
embeddings, ask, dir-summarizer, classifier, change-intent,
dead-code, naming, agent-bridge, detect (auto-discovers Ollama et al).

Schema v14. 787/787 tests pass (3 new LLM test files included).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three measurements on a synthesized 50K-node, 250K-edge,
50K-embedding DB (realistic mid-size codebase shape):

F. Redundant indexes (idx_edges_source, idx_edges_target,
   idx_co_changes_a):
   - DB size: -17.6% (43.8 MB → 36.1 MB)
   - Bulk insert: 1.31× faster
   - Query latency for source-only / target-only queries:
     no regression (kind-prefixed indexes still cover them)
   Verdict: CLEAR WIN, no downsides.

G. Embedding storage split (inline vs separate table):
   - Summary-only scan: 3.34× faster on split (46ms → 14ms)
   - Summary+embedding scan: 1.11× cost on split (74ms → 82ms,
     extra JOIN)
   - DB size: split slightly larger (204 vs 196 MB)
   Verdict: NET POSITIVE — common-path 3.3× win outweighs
   rare-path 11% cost.

H. In-memory embedding cache for similarity search:
   - Cold (per-query SQLite fetch): 106ms avg
   - Warm (in-memory Float32Array): 25ms avg
   Verdict: 4.2× speedup, CLEAR WIN.

All three optimizations validated. ~4 hours total to ship.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…emory similarity cache

Three independently measurable wins on the LLM-tier data layer,
all validated up-front via spike before implementation
(scripts/spikes/spike-embedding-split.mjs).

F2 - drop idx_co_changes_a (migration 015)
  The (file_a, file_b) PRIMARY KEY index already covers
  WHERE file_a = ? via SQLite left-prefix scan, so the narrow
  idx_co_changes_a was dead weight. idx_co_changes_b (on file_b
  alone) is kept because the PK leads with file_a.

G - split embeddings into a dedicated table (migration 016)
  Moves the 768-dim Float32 BLOB out of symbol_summaries into a
  new symbol_embeddings table with FK + ON DELETE CASCADE.
  Spike measurement on a 50K-summary synthetic DB:
    - Summary-only scan (common path): 3.22x faster (46ms -> 14ms)
    - Summary+embedding scan (rare path): 1.12x cost penalty
    - DB size: ~4% larger (separate page chain)
  Net positive: the common path dominates real usage.

H - in-memory EmbeddingCache for similarity search
  EmbeddingCache decodes every embedding into a flat Float32Array
  matrix once and reuses it across queries. topKByCosineMatrix
  operates directly on the flat layout. Cache is invalidated on
  indexAll, sync, embedAllSummaries (when generated > 0), and
  clear() - anywhere new vectors land or the table is emptied.
    - Cold (per-query SQLite fetch + decode): 104ms avg
    - Warm (in-memory matrix): 24ms avg
    - 4.4x speedup with cache
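The flat-layout scoring can be sketched like this; the exact signature of the real topKByCosineMatrix is assumed for illustration:

```typescript
// Cosine top-K over a flat row-major Float32Array matrix: n vectors of
// dimension `dim` packed back to back, scored against one query vector.
function topKByCosineMatrix(
  matrix: Float32Array,
  dim: number,
  query: Float32Array,
  k: number,
): { index: number; score: number }[] {
  const n = matrix.length / dim;
  let qNorm = 0;
  for (let i = 0; i < dim; i++) qNorm += query[i] * query[i];
  qNorm = Math.sqrt(qNorm);

  const scored: { index: number; score: number }[] = [];
  for (let row = 0; row < n; row++) {
    const base = row * dim; // flat layout: no per-row decode or allocation
    let dot = 0;
    let rNorm = 0;
    for (let i = 0; i < dim; i++) {
      const v = matrix[base + i];
      dot += v * query[i];
      rNorm += v * v;
    }
    const denom = qNorm * Math.sqrt(rNorm);
    scored.push({ index: row, score: denom === 0 ? 0 : dot / denom });
  }
  scored.sort((a, b) => b.score - a.score);
  return scored.slice(0, k);
}
```

Decoding every BLOB into this matrix once, then reusing it across queries, is what turns the per-query SQLite fetch + decode into a pure arithmetic pass.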

Also fixes a pre-existing schema.sql inconsistency where
idx_co_changes_b was declared twice (harmless thanks to
IF NOT EXISTS, but confusing).

Test coverage:
  - __tests__/migrations-015-016.test.ts: upgrade-path and
    fresh-DB behavior for both new migrations.
  - __tests__/embeddings.test.ts: topKByCosineMatrix matches
    topKByCosine; EmbeddingCache hit/miss/invalidate/dim-mismatch.

794/794 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…igration 017

Brings colbymchenry#122's perf win onto the post-colbymchenry#118 file-based migration pattern so it can integrate with colbymchenry#123. Behavior identical to colbymchenry#122 against main: drops idx_edges_source and idx_edges_target.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `node_coverage` table + ingestion pipeline + MCP tool so agents
can answer "is this function tested?", "what high-impact code has
zero coverage?", and "should I write a test before refactoring this?"
without leaving the graph.

Pieces shipped:

  - `src/coverage/lcov.ts` -- streaming-friendly lcov.info parser
    handling DA/BRDA records, CRLF endings, missing end_of_record,
    and Istanbul's negative hit-count sentinel for excluded lines.
  - `src/coverage/index.ts` -- ingestion orchestrator that maps
    report paths to indexed nodes (with monorepo-style suffix
    matching for reports rooted at the repo root) and rolls hits
    up to symbol spans.
  - migration 018 + schema.sql: `node_coverage` table keyed on
    (node_id, source) so multiple suites coexist.
  - QueryBuilder: `upsertNodeCoverage`, `getNodeCoverage`,
    `getCoverageRanked`, `getCoverageStats`, `clearCoverageSource`.
  - MCP tool `codegraph_coverage` with three modes (symbol /
    ranked / stats); the killer query is
    `mode=ranked, minCentrality=0.001, maxPct=0.5` -- high-impact
    untested code in one call.
  - CLI: `codegraph coverage <report> [path]` with --source and
    --clear flags.
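A minimal sketch of the DA-record core described above. The real parser (`src/coverage/lcov.ts`) also handles BRDA records, CRLF endings, and more; the record-type names here are standard lcov, but the output shape is illustrative:

```typescript
interface FileCoverage {
  path: string;
  linesHit: number;
  linesTotal: number;
}

function parseLcovDa(text: string): FileCoverage[] {
  const out: FileCoverage[] = [];
  let current: FileCoverage | null = null;
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line.startsWith("SF:")) {
      current = { path: line.slice(3), linesHit: 0, linesTotal: 0 };
    } else if (line.startsWith("DA:") && current) {
      // DA:<line>,<hits>. Istanbul emits negative hit counts as a
      // sentinel for excluded lines -- skip so totals aren't inflated.
      const hits = Number(line.split(",")[1]);
      if (hits < 0) continue;
      current.linesTotal += 1;
      if (hits > 0) current.linesHit += 1;
    } else if (line === "end_of_record" && current) {
      out.push(current);
      current = null;
    }
  }
  if (current) out.push(current); // tolerate a missing trailing end_of_record
  return out;
}
```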

Independent reviewer pass caught and fixed three issues before
shipping: (1) negative DA hit counts inflated totalLines for
Istanbul-excluded regions, (2) the `pct` SQL expression returns
null for zero-line spans and the TS type didn't reflect that,
(3) monorepo-style report paths (e.g. `packages/api/src/foo.ts`
when the project root is `packages/api/`) failed to match.

Tested:
  - 13 unit tests covering parser edge cases, span rollup,
    end-to-end ingestion, idempotency, monorepo path matching,
    and clearSource semantics.
  - 807/807 full suite pass.
  - 10/10 smoke checks against codegraph's own indexed source.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR colbymchenry#124 referenced src/coverage/index.ts (in ingestCoverage's dynamic
import) and src/coverage/lcov.ts (imported by the test's parseLcov /
summariseSpan), but the directory was never committed. Root cause:
.gitignore had an unanchored "coverage/" rule meant for test output,
which silently swallowed src/coverage/ too. Anchor the rule to the
repo root ("/coverage/") and add the missing module.

Implements: lcov parsing (DA/BRDA, Istanbul -1 sentinel handling),
per-span rollup (lines + branches), and the ingestion orchestrator
with exact-then-longest-suffix path matching for monorepo cases.

All 824 tests pass; the previously-orphaned coverage.test.ts (13
tests) now exercises the real module instead of failing at import.
When called with a source argument, getCoverageStats correctly filtered
the numeric rollup but returned every known source in `sources[]`. This
surfaced in the MCP tool: `mode='stats', source='unit'` would render
"Sources: unit, e2e" alongside unit-only numbers.

Apply the same WHERE clause to the DISTINCT source query and add a
regression assertion.
Five static-analysis biomarkers run on every indexed function and
method, persisted to a `code_health_findings` table the agent can
query directly. Killer query: `mode=ranked, minSeverity=warning,
minCentrality=0.001` -> high-impact code with structural problems
in one MCP call.

What ships:

  - Five biomarkers, McCabe-style cyclomatic + nesting + length:
    Large Method, Complex Method, Nested Complexity, Complex
    Conditional, Brain Method (the AND of the first three at
    warning+).
  - Per-language tree-sitter node-kind sets (TS/JS, Python, Go,
    Java today; languages without an entry are skipped).
  - 1-10 Code Health score aggregated from findings. Score 10 =
    no findings; -2/-1/-0.5 per error/warning/info.
  - `src/biomarkers/engine.ts` does the iterative DFS metric
    computation; `src/biomarkers/index.ts` orchestrates analysis
    across the project; `src/index-hooks/biomarkers.ts` wires it
    into indexAll/sync (only re-analyses changed files on sync).
  - Migration 019: `code_health_findings` keyed on
    (node_id, biomarker), FK CASCADE.
  - QueryBuilder: replaceFindingsForFile (atomic clear+insert),
    getFindingsForNode, getFindingsRanked, getFindingsStats.
  - MCP tool `codegraph_biomarkers` with three modes
    (symbol/ranked/stats) plus filters.
  - `enableBiomarkers` config flag (default on).
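The Code Health aggregation above can be sketched directly from the stated penalties; the clamp to the 1-10 floor is an assumption implied by the stated range:

```typescript
type Severity = "error" | "warning" | "info";

// Penalties as described: -2 per error, -1 per warning, -0.5 per info.
const PENALTY: Record<Severity, number> = {
  error: 2,
  warning: 1,
  info: 0.5,
};

function codeHealthScore(findings: { severity: Severity }[]): number {
  const raw = findings.reduce((score, f) => score - PENALTY[f.severity], 10);
  return Math.max(1, raw); // assumed clamp so the score stays in 1-10
}
```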

Independent reviewer pass caught and fixed three correctness bugs
before shipping: (1) stale findings persisted indefinitely when a
function was refactored clean -- now we always call
replaceFindingsForFile, even with an empty map; (2)
countConditionalOperands descended into nested function bodies,
inflating outer-conditional counts -- now stops at function-
container boundaries; (3) `else_clause` was in the cyclomatic
branching set, doubling every if/else pair away from McCabe's
definition -- now removed for TS/JS and Python.

Plus two info-level fixes: codeHealthScore import hoisted out of
the MCP hot path, and an explicit hint surfaces when a
minCentrality filter excludes everything because centrality
hasn't been computed yet.

Tested:
  - 17 unit tests covering metric computation, rule evaluation,
    Code Health aggregation, end-to-end indexAll persistence,
    sync-time stale-finding clearance, callback-isolation in
    conditional operands.
  - 811/811 full suite pass.
  - 8/8 smoke checks against codegraph's own indexed source --
    407 findings across 200 nodes, top hotspot is
    `visitNode` in src/extraction/tree-sitter.ts (Brain Method,
    cyclomatic 48, nesting 15, Code Health 3/10).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
findNodeAt() called parser.parse(source) on every invocation, and
analyseProject() called it once per analysable symbol. On a real
codebase that means re-parsing the same file dozens of times back-
to-back, which exhausts the WASM tree-sitter heap and triggers
RuntimeError: memory access out of bounds. Each crash silently drops
one symbol's finding.

On ollama/ollama@v0.22.0 (29k symbols, 1.2GB repo with vendored
llama.cpp / MLX-C):

  before: 3,066 WASM crashes, 3,104 symbols analysed, 1,483 findings, 7.6s
   after:     0 crashes,      6,217 symbols analysed, 3,056 findings, 1.6s

Refactors findNodeAt into two pieces:
  - parseSource(src, language) -> Tree | null  (call once per file)
  - findNodeInTree(tree, line, column) -> TsNode | null
The original findNodeAt is kept as a thin wrapper so existing
single-shot callers (and the public re-export) keep working unchanged.

analyseProject now parses once per file and walks the cached tree
for each symbol, eliminating the per-symbol re-parse entirely.
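The shape of that split can be sketched as follows; a plain object stands in for the real tree-sitter Tree, a counter stands in for the expensive WASM parse, and all names here are illustrative:

```typescript
type Tree = { source: string }; // stand-in for a tree-sitter Tree

let parseCount = 0;

function parseSource(src: string): Tree | null {
  parseCount++; // the expensive step: now exactly one per file
  return { source: src };
}

function findNodeInTree(tree: Tree, line: number): string {
  // The real helper walks the cached tree to the node at (line, column);
  // a placeholder keeps this sketch runnable.
  return `${tree.source}:${line}`;
}

function analyseProject(
  files: { source: string; symbolLines: number[] }[],
): number {
  let analysed = 0;
  for (const file of files) {
    const tree = parseSource(file.source); // once per file, not per symbol
    if (!tree) continue;
    for (const line of file.symbolLines) {
      findNodeInTree(tree, line); // cheap walk of the cached tree
      analysed++;
    }
  }
  return analysed;
}
```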
…, magic_number, hardcoded_url

Extends the biomarker engine with rules that look BEYOND per-method
complexity. Three operate on metrics gathered during the per-symbol
AST walk; one is a cross-file graph rule that runs once per analyseProject.

Per-symbol additions to SymbolMetrics:
  - paramCount         (counted from the function/method's parameter list)
  - magicNumberCount   (numeric literals not in the trivial allow-list 0/1/-1/2)
  - hardcodedUrlCount  (string literals matching http(s)|ws(s)|ftp|s3|gs URLs)
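The two literal checks can be sketched like this; the allow-list and scheme list come from this description, the helper names are illustrative:

```typescript
// Trivial allow-list: 0, 1, -1, 2. Float variants ("1.0") collapse to
// the same value via Number(), so they hit the allow-list too.
const TRIVIAL_NUMBERS = new Set([0, 1, -1, 2]);

function isMagicNumber(literal: string): boolean {
  const value = Number(literal);
  return Number.isFinite(value) && !TRIVIAL_NUMBERS.has(value);
}

// Matches http(s)/ws(s)/ftp/s3/gs scheme prefixes inside a string literal.
const URL_SCHEME = /\b(?:https?|wss?|ftp|s3|gs):\/\//;

function isHardcodedUrl(literal: string): boolean {
  return URL_SCHEME.test(literal);
}
```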

Rules:
  - long_parameter_list  thresholds info=4 / warning=5 / error=7 (Clean Code)
  - magic_number         info=3 / warning=5 / error=8
  - hardcoded_url        info on first occurrence, warn at 2, error at 3+
  - unused_export        single SQL query: nodes with is_exported=1 and no
                         non-`contains` incoming edge from outside their file.
                         Skipped on partial scans; uses appendFindings +
                         clearFindingsByKind so it doesn't clobber per-symbol
                         findings written earlier in the pass.
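The tiering above reduces to one comparison helper; thresholds are copied from this description, function names are illustrative:

```typescript
type Tier = "info" | "warning" | "error" | null;

// Highest threshold the value meets wins; below info means no finding.
function tier(value: number, info: number, warning: number, error: number): Tier {
  if (value >= error) return "error";
  if (value >= warning) return "warning";
  if (value >= info) return "info";
  return null;
}

const longParameterList = (paramCount: number) => tier(paramCount, 4, 5, 7);
const magicNumber = (count: number) => tier(count, 3, 5, 8);
const hardcodedUrl = (count: number) => tier(count, 1, 2, 3);
```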

DB additions:
  - findUnusedExports()       single graph query
  - appendFindings(...)       insert without delete — needed by cross-file rules
  - clearFindingsByKind(...)  global wipe by biomarker name

Verified live on a synthetic stress fixture with one 7-param function
using 5 magic numbers and 3 dead exports:
  fetchAll → long_parameter_list error (7), magic_number error (10)
  unused1, unused2, consumeFetch → unused_export warning

Full test suite: 824 passed (up from 380 — pulls in biomarker tests).
The pre-existing "clean code = 0 findings" test fixture was updated
to use non-exported helpers so unused_export's warning doesn't fire
on the fixture (semantically those WERE unused exports — calibration
artefact, not a regression).
…rsistence, param kinds, transaction wrap, unit tests

Six fixes from independent semantic review of the prior commit:

- findUnusedExports SQL: also exclude exports/imports/tests edge kinds.
  Re-export barrels (export { x } from './a') created cross-file
  exports edges that masked dead exports as 'used'. Same for the
  import-statement edge kind and the convention-derived tests edge.
  Now only real-use edges (calls/references/instantiates/extends/etc.)
  count toward 'used'.

- replaceFindingsForFile: exclude unused_export from the per-file
  DELETE, so sync runs that touch a file no longer silently wipe
  its cross-file findings without recomputing them.

- countParameters: drop argument_list from PARAM_LIST_KINDS. That's
  a call-site node kind, not a declaration-site one — including it
  could spuriously inflate paramCount for grammars that surface it
  as a direct child of a function node.

- analyseProject unused_export: wrap clearFindingsByKind +
  appendFindings in queries.transaction so concurrent readers never
  see the intermediate empty state between DELETE and INSERT.

- evaluateRules tests: introduce a metrics() helper that defaults
  every SymbolMetrics field, so tests only specify the field they
  exercise and the suite stays robust as more metrics are added.

- New unit tests: long_parameter_list (warning at 5, none at 3),
  magic_number (error at 8, none at 2), hardcoded_url (info/warn/
  error tiers, none at 0). Brings the file to 830 tests passing
  (was 824).
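The edge-kind exclusions above can be mirrored over plain objects. The real findUnusedExports is a single SQL query; the shapes here are illustrative, but the predicate is the same: an export counts as used only if some incoming edge is a real-consumer kind AND originates in a different file.

```typescript
interface GraphNode { id: number; file: string; isExported: boolean }
interface GraphEdge { source: number; target: number; kind: string }

// Structural / non-consuming edge kinds, as described above.
const NON_USE_KINDS = new Set(["contains", "exports", "imports", "tests"]);

function findUnusedExports(nodes: GraphNode[], edges: GraphEdge[]): GraphNode[] {
  const fileOf = new Map<number, string>();
  for (const n of nodes) fileOf.set(n.id, n.file);

  const used = new Set<number>();
  for (const e of edges) {
    if (NON_USE_KINDS.has(e.kind)) continue;                      // not a real use
    if (fileOf.get(e.source) === fileOf.get(e.target)) continue;  // same-file
    used.add(e.target);
  }
  return nodes.filter((n) => n.isExported && !used.has(n.id));
}
```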