feat(mcp): surface LLM provider routing in status/ask/similar [WIP 5/5] by andreinknv · Pull Request #7 · andreinknv/codegraph

andreinknv · 2026-05-01T11:21:30Z

Stress-test fixes — PR 5 of 5 (stacked). Base: pr4-gc-wasm-fixes.

Commits

56fac32 feat(mcp): surface LLM provider routing in status, ask trailer, and similar empty-state

Summary

Three formatter gaps surfaced by the 23-tool MCP benchmark on 9adac63:

codegraph_status gains a ### LLM providers block listing chat / askChat / embedding model + provider + endpoint. The Ask line only renders when askChat differs from chat, so single-provider configs stay clean. The Ask line does not borrow the chat block's endpoint when askChat omits one — claude-bridge has no HTTP endpoint, and falling back to the chat MLX URL was misleading. (This last bit was a cosmetic glitch caught while live-probing the original patch; the deterministic invariant is pinned by a new test.)
codegraph_ask trailer appends ; model `<id>` so the agent can confirm whether ask routed to Sonnet or the local chat model without timing-based inference.
codegraph_similar 3-way empty-state distinguishes (a) no embedding model configured, (b) source has no embedding row for the configured model, (c) source embedded but no neighbours above threshold. Cites the model id in case (b).

Adds hasSymbolEmbedding(nodeId, model?) on the query layer + a CodeGraph passthrough so the similar handler can tell embedding absence from threshold misses.
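
The three-way empty state can be sketched as follows. This is a minimal illustration, not the handler from the PR: `classifyEmptySimilar`, `formatEmptySimilar`, the `EmptyState` shape, the message wording, and the synchronous boolean return of `hasSymbolEmbedding` are all assumptions; only the `hasSymbolEmbedding(nodeId, model?)` name and the three cases come from the summary above.

```typescript
// Hypothetical types: the real handler's shapes are not shown in this PR.
type EmptyState =
  | { kind: "no-model" }
  | { kind: "no-embedding-row"; model: string }
  | { kind: "below-threshold"; model: string };

// Assumed signature: the PR names hasSymbolEmbedding(nodeId, model?) but
// does not say whether it is sync or async; a sync boolean is assumed here.
type HasSymbolEmbedding = (nodeId: string, model?: string) => boolean;

// Decide why a similar-symbols lookup came back empty.
function classifyEmptySimilar(
  nodeId: string,
  model: string | undefined,
  hasSymbolEmbedding: HasSymbolEmbedding,
): EmptyState {
  if (!model) return { kind: "no-model" }; // case (a)
  if (!hasSymbolEmbedding(nodeId, model)) {
    return { kind: "no-embedding-row", model }; // case (b)
  }
  return { kind: "below-threshold", model }; // case (c)
}

// Illustrative wording only; case (b) cites the model id as the PR describes.
function formatEmptySimilar(state: EmptyState): string {
  switch (state.kind) {
    case "no-model":
      return "No embedding model configured.";
    case "no-embedding-row":
      return `Source has no embedding row for model ${state.model}.`;
    case "below-threshold":
      return "Source is embedded, but no neighbours scored above the threshold.";
  }
}
```

Keeping classification separate from formatting means the invariant (which case fired) can be tested without matching on message text.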

Test plan

Suite green at 1113 / 13 skip / 0 fail.
Live MCP probe (post-restart) confirms Ask model: claude-sonnet-4-6 provider claude-bridge with NO @ http://... segment. Already verified 2026-05-01.
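
The no-fallback behaviour on the Ask line could look roughly like the sketch below. The `ProviderInfo` shape, the function names, the exact line format, and the example MLX URL are assumptions for illustration; the PR only states that the Ask line renders when askChat differs from chat and that it never borrows the chat endpoint.

```typescript
// Hypothetical provider shape; field names are assumptions.
interface ProviderInfo {
  model: string;
  provider: string;
  endpoint?: string; // absent for providers with no HTTP endpoint
}

// Render one line; the "@ <endpoint>" segment appears only when the
// provider itself declares an endpoint (no borrowing from another block).
function formatProviderLine(label: string, p: ProviderInfo): string {
  const endpoint = p.endpoint ? ` @ ${p.endpoint}` : "";
  return `${label} model: ${p.model} provider ${p.provider}${endpoint}`;
}

function formatLlmProviders(chat: ProviderInfo, askChat?: ProviderInfo): string[] {
  const lines = [formatProviderLine("Chat", chat)];
  const differs =
    askChat !== undefined &&
    (askChat.model !== chat.model ||
      askChat.provider !== chat.provider ||
      askChat.endpoint !== chat.endpoint);
  // Single-provider configs stay clean: no Ask line unless it differs.
  if (askChat !== undefined && differs) {
    lines.push(formatProviderLine("Ask", askChat));
  }
  return lines;
}

// Example with an assumed local MLX URL for the chat provider.
const exampleLines = formatLlmProviders(
  { model: "local-chat", provider: "mlx", endpoint: "http://localhost:8081/v1" },
  { model: "claude-sonnet-4-6", provider: "claude-bridge" },
);
```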

WIP draft.

…imilar empty-state

Three formatter gaps surfaced by the 23-tool MCP benchmark on 9adac63:

* `codegraph_status` gains a `### LLM providers` block listing chat / askChat / embedding model + provider + endpoint. The Ask line only renders when askChat differs from chat, so single-provider configs stay clean. The Ask line no longer borrows the chat block's endpoint when askChat omits one: claude-bridge has no HTTP endpoint, and falling back to the chat MLX URL was misleading.
* `codegraph_ask` trailer now appends `; model <id>` so the agent can confirm whether ask routed to Sonnet or the local chat model without timing-based inference.
* `codegraph_similar` distinguishes three empty-state branches: no embedding model configured, source has no embedding row for the configured model, and source embedded but no neighbours above threshold.

Adds `hasSymbolEmbedding(nodeId, model?)` on the query layer + a CodeGraph passthrough so the similar handler can tell embedding absence from threshold misses.

Six existing visibility tests + one new test pinning the no-fallback behavior on the Ask line. Suite: 1113 passed / 13 skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

andreinknv · 2026-05-01T11:31:13Z

Stack collapsed into wip-stress-base (now at 56fac32). Commits landed via the fast-forward of wip-stress-base from df96c6c to 56fac32; #3 shows as merged, #4-#7 are closed because their tips are already ancestors of the base. Branch preserved for reference.

… server-config flags

Tooling-gap backlog (codegraph/docs/codegraph-tooling-gaps.md) closed:

- #1 freshness severity bucket: `classifyFreshness` with fresh|recent|stale|very_stale
- #2 allowStale flag: opt-in bypass for the heavy-drift gate, registry-injected schema
- #3 module format in status: `module-format.ts` parses package.json + tsconfig (JSONC-safe)
- #4 codegraph_imports tool + import-classifier: file/directory/bare/unresolvable filters
- #5 dynamic imports: extractor catches `import('…')` + `require('…')`, incl. template_string
- #6 build-context refs: new `build_context_refs` table for `__dirname` / `import.meta.*`
- #7 files.is_test flag: column populated by glob; surfaced in status as `(N test)`
- colbymchenry#11 summarize-also-embeds (discovered while dogfooding): `cg.summarizeAll()` chains `embedAllSummaries`; new `cg.embedAll()` for embed-only path; CLI `codegraph embed`

CLI/MCP alignment (5/32 → 33+/35):

- 13 new CLI commands via `runViaMCP` shim: callers, callees, impact, node, similar, biomarkers, imports, help-tools, explore, hotspots, dead-code, config-refs, sql-refs, module-summary, role, coverage-query, pending-summaries, save-summaries, review-context
- 7 new MCP tools: codegraph_imports, codegraph_embed, codegraph_summarize, codegraph_sync, codegraph_reindex, codegraph_coverage_ingest, codegraph_init, codegraph_uninit, codegraph_unlock, codegraph_affected

MCP server-level operator config (`codegraph serve --mcp`):

- --no-write-tools / --allow-stale-default / --disable-tool (sandboxing)
- --llm-endpoint / --llm-chat-model / --llm-ask-model / --llm-embedding-model / --llm-api-key (operator LLM config; per-project config wins on conflict)
- New CODEGRAPH_LLM_* env vars wired through `mergeLlmEnv` in resolveLlmProviders

Architectural cleanups:

- `bypassFreshnessGate` and `isWriteTool` declarative flags on ToolModule (replaces growing string-comparison chain in execute())
- `withAllowStale` registry injection only on tools that DO see the gate
- DRY of inline copy-paste in 3 hooks → `src/index-hooks/enclosing.ts`
- `LlmClient.isEmbeddingReachable` for split-provider correctness
- SyncResult `lockContention` flag → handleSync emits distinct retryable message
- `clearStructural` deletes from build_context_refs (was orphan-leaking on --force)
- cli:dev npm script + tsx CLI fixed (web-tree-sitter `import type` for type-only refs)

Migrations:

- 023-files-is-test.ts: add `files.is_test`
- 024-build-context-refs.ts: add `build_context_refs` table

Reviewer rounds: 11 total, all REQUEST_CHANGES addressed inline. Notable fixes:

- JSONC URL strip via state machine (was eating `https://` tails)
- classifyFreshness very_stale now requires isStale (in-sync-but-old → recent)
- Dynamic imports also match template_string nodes
- process.exit deferred until after finally cleanup in runViaMCP
- --same-language / --different-language mutual exclusion guard
- help-tools CLI bypasses isInitialized (works without a project)
- handleUninit sweeps projectCache by getProjectRoot (no dangling alias leaks)
- handleAffected errors instead of silently dropping unsupported glob filters
- mergeLlmEnv preserves precedence: legacy flat config wins over env-synthesised block

Suite: 1268 passing, 1 expected red (colbymchenry#8, undecided), 13 skipped, 1 todo, 0 regressions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
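
The "JSONC URL strip via state machine" fix listed above can be illustrated with a small comment stripper. This is a sketch of the general technique, not the code in `module-format.ts`: the function name `stripJsoncComments` is hypothetical, and trailing commas (which JSONC also allows) are deliberately not handled.

```typescript
// Strip // and /* */ comments from JSONC while leaving string contents
// untouched, so a value like "https://example.com" keeps its "//" tail.
function stripJsoncComments(input: string): string {
  let out = "";
  let i = 0;
  let inString = false;
  while (i < input.length) {
    const ch = input[i];
    const next = input[i + 1];
    if (inString) {
      out += ch;
      if (ch === "\\" && i + 1 < input.length) {
        // Copy the escaped character verbatim so \" does not end the string.
        out += next;
        i += 2;
        continue;
      }
      if (ch === '"') inString = false;
      i++;
      continue;
    }
    if (ch === '"') {
      inString = true;
      out += ch;
      i++;
      continue;
    }
    if (ch === "/" && next === "/") {
      // Line comment: skip to end of line, keep the newline itself.
      while (i < input.length && input[i] !== "\n") i++;
      continue;
    }
    if (ch === "/" && next === "*") {
      // Block comment: skip through the closing */ (or to end of input).
      i += 2;
      while (i < input.length && !(input[i] === "*" && input[i + 1] === "/")) i++;
      i += 2;
      continue;
    }
    out += ch;
    i++;
  }
  return out;
}
```

A regex such as `/\/\/.*$/gm` cannot tell a comment from the `//` inside a URL string, which is the class of bug the commit describes; tracking the in-string state avoids it.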

Pre-PR, every spawn of the independent reviewer subagent had to re-derive the same scrutiny areas and repo-specific conventions that prior reviewer passes had already taught me: wasted tokens and missed catches when I forgot to re-mention something.

This adds `.claude/reviewer-memo.md`, a per-repo memo that captures the actually-recurring patterns from this session's reviews:

Recurring scrutiny areas

- Docstring rotting after a behavior change (caught twice this session: cross-file biomarker gate, parse_cache payload comment).
- Counter accuracy when adding skip paths (stale-skips were bumping `errors` in summarizer + embedder; both caught).
- SQL injection-shaped wildcards (GLOB metachar bypass on `excludeFile`).
- Schema-version test forgetfulness (every migration breaks two hardcoded version assertions).
- Schema entry only in migration, not in schema.sql (caught on 026-parse-cache: tests use fresh DBs which read schema.sql, not migrations).
- Test environment mutation cleanup (env vars leak across tests on mid-setup throw).
- Speculative / dead exports (YAGNI per repo CLAUDE.md).
- Off-by-one between internal over-fetch and user-facing limit (scoreAndDiversify was getting cascadeLimit instead of limit).
- CLI/MCP alignment claims (PR descriptions sometimes claim a CLI mirror that isn't actually wired).

Repo-specific conventions

- FK-safe upsert pattern: `INSERT ... SELECT ... WHERE EXISTS (...) ON CONFLICT DO UPDATE`, returning boolean.
- Free-function clusters in src/db/queries-*.ts; don't re-merge into a class (god_class refactor escape).
- Three-edit MCP tool registration discipline.
- runViaMCP doesn't preload tree-sitter grammars; tools that use extractFromSource must defensively preload.
- `cli:dev` script vs `npx codegraph` shadowing.
- Standard verdict JSON shape.

CLAUDE.md updated with a section telling the parent agent (me) to prepend the memo's content to every reviewer-subagent prompt. Without that wire, the memo is just a static doc; the parent agent is the conduit because the reviewer subagent doesn't auto-load files from the working repo.

Smoke-test happens organically on the next reviewer pass (no need to manufacture one). When a NEW recurring pattern is caught (same finding shape in two separate diffs), append to the memo so the n+1th review starts richer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…enry#28)

Two changes in one; they are tightly coupled (the dedup was motivated by the user catching that adding one language touched 6 places).

Lua addition

- src/extraction/wasm/tree-sitter-lua.wasm vendored from tree-sitter-wasms@0.1.13 (MIT).
- src/extraction/languages/lua.ts: LanguageDef + LanguageExtractor patterned on bash.ts. Handles function_definition_statement (top-level + M.method + M:method), local_function_definition_statement, local_variable_declaration. Fixture under docs/test-beds/lua/fixture.lua extracts 11 symbols, 0 errors. v1 scope: top-level functions + table methods + locals. `require()` import-node promotion deferred: the visitor scaffolding is in place but doesn't fire for calls nested inside local declarations; documented as a known v1 gap.
- src/extraction/languages/registry.ts adds LUA_DEF.
- src/types.ts adds 'lua' to the Language union.
- __tests__/pr19-improvements.test.ts: vendored-grammar count bumped 24 → 25.

Dedup

User caught that the registration surface had drifted:

- **Real drift bug fixed**: src/search/query-parser.ts LANGUAGE_VALUES was an unchecked duplicate that had ALREADY drifted before this change. It only listed 20 of 24 registered languages, so `lang:bash` / `lang:rescript` etc. silently fell through to FTS text instead of filtering. Replaced with `new Set(VALID_LANGUAGES.filter(l => l !== 'unknown'))`. VALID_LANGUAGES (config.ts) has the existing compile-time exhaustiveness check against the Language union, so future drifts on the search-filter set are now blocked at the build. VALID_LANGUAGES went from `const` to `export const`.
- **JS-family helper**: 5 sites in resolution/import-resolver.ts (×2), resolution/index.ts, extraction/tree-sitter-decls.ts (×2) had the same hardcoded `lang === 'typescript' || lang === 'javascript' || lang === 'tsx' || lang === 'jsx'` disjunction. Replaced with a single `isJsFamily()` helper in src/utils.ts. resolution/index.ts's existing `isJsTsLanguage` is now a one-liner alias to keep call sites stable.

After this commit, adding a new language touches 4 places: WASM file, extractor, registry entry, types.ts Language union. The 4th is intrinsic to TypeScript: the type system needs a literal union at compile time. The 1st-3rd are the irreducible implementation surface.

Backlog scope decision

Per user 2026-05-03: PowerShell / Solidity / Elixir descoped from B colbymchenry#28; audiences too narrow to justify the per-language maintenance cost (grammar sync, extractor edge cases, fixture, eval footprint). Lua kept (Neovim configs, OpenResty, embedded scripting, game logic). Re-evaluate on demand.

Reviewer pass: APPROVE with 2 info-level edge-case findings:

- multi-param Lua signature could double-paren if grammar emits parens in `parameter_list` (fixture probe didn't surface this); worth a fixture-pinned signature assertion in a follow-up.
- `readLuaStringLiteral` skips leveled long-brackets `[==[...]==]`; documented v1 gap.

Sixth consecutive review where memo content (scrutiny areas #1 docstring rot and #7 speculative-export check) was load-bearing.

Verification

- npm run typecheck (tsgo): clean.
- npx vitest run: 1371 / 34 / 0 (unchanged).
- End-to-end Lua extraction probe: 11 nodes, 10 edges, 0 errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
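
A reduced sketch of the dedup described above. The shortened `Language` union and the `Record`-based exhaustiveness trick are assumptions for illustration (the commit does not show how config.ts implements its compile-time check); `isJsFamily`, `isJsTsLanguage`, `VALID_LANGUAGES`, and the `LANGUAGE_VALUES` expression are the names used in the commit message.

```typescript
// Shortened union for illustration; the real one lists every registered language.
type Language = "typescript" | "javascript" | "tsx" | "jsx" | "lua" | "bash" | "unknown";

// Assumed exhaustiveness mechanism: a Record keyed by the union fails to
// compile if a member of Language is missing or misspelled.
const LANGUAGE_FLAGS: Record<Language, true> = {
  typescript: true,
  javascript: true,
  tsx: true,
  jsx: true,
  lua: true,
  bash: true,
  unknown: true,
};

export const VALID_LANGUAGES = Object.keys(LANGUAGE_FLAGS) as Language[];

// Search-filter set derived from the single source of truth, as in the commit.
export const LANGUAGE_VALUES = new Set(VALID_LANGUAGES.filter((l) => l !== "unknown"));

const JS_FAMILY: ReadonlySet<Language> = new Set<Language>([
  "typescript",
  "javascript",
  "tsx",
  "jsx",
]);

// One helper replaces the repeated four-way disjunction.
export function isJsFamily(lang: Language): boolean {
  return JS_FAMILY.has(lang);
}

// Alias kept so existing call sites stay stable.
export const isJsTsLanguage = isJsFamily;
```

With this shape, adding `'lua'` to the union without updating `LANGUAGE_FLAGS` is a type error, so the derived search-filter set cannot silently fall behind.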

Pre-PR, `graphql-extractor.ts` explicitly skipped `type_system_extension` AST nodes ("intentionally skipped in v1 - merging extensions across files needs a second resolution pass we don't do yet"), so federation-style `extend type User { posts: [Post] }` produced zero nodes.

Post-PR, each extension emits a separate node carrying the new fields/values plus an `extends` UnresolvedReference targeting the base type; cross-file merging is reconstructible by walking the resolver-promoted edges.

Mapping

- `extend type X { … }` → class + extends ref
- `extend interface X { … }` → interface + extends ref
- `extend input X { … }` → class + extends ref
- `extend enum X { … }` → enum + extends ref + new enum_members
- `extend union X = …` → type_alias + extends ref + new union refs
- `extend scalar X` → unsupported by tree-sitter-graphql 0.1.0 (parses as ERROR); defensive scaffold kept for a future grammar bump

Per-line node-id derivation makes multi-extension cases distinct (`extend type User` at L5 and L20 both produce nodes named `User` of kind `class` with separate ids). Cross-file: filePath in the id-hash makes them unique by source location. Fields / enum values / union members go under the extension node, preserving "this field came from this extender" provenance.

Known same-file edge case

If a base definition and its extension live in the SAME file, the existing `findBestMatch` line-proximity may pick the extension's own node (distance 0) over the base definition (distance > 0), producing a self-referential extends edge. Federation patterns put base + extension in different files, which is what this targets. Documented in `pushExtendsRef` JSDoc as a future resolver-pass filter target.

Files

- src/extraction/graphql-extractor.ts: visitDefinition routes to the new `visitTypeSystemExtension` dispatcher; 6 emit*Extension methods reuse `emitFieldsOf` and the new `pushExtendsRef` helper. Class-level docstring mapping table updated to cover the extension forms (memo scrutiny-area #1 catch by reviewer).
- __tests__/graphql-extend-type.test.ts: 3 new cases (5 kinds end-to-end, signature distinction, type_of refs).
- __tests__/extraction.test.ts: one existing test flipped from "extend type silently produces zero nodes (v1 out-of-scope)" to "extension node + extends ref emitted".
- docs/test-beds/graphql/fixture.graphql: full schema fixture covering definitions and all 5 supported extension forms; auto-discovered by the language-coverage harness.

Verification

- npm run typecheck (tsgo): clean.
- npx vitest run: 1374 / 34 / 0 (was 1371; +3 new + 1 flipped).
- E2E probe on a multi-kind extend fixture: 5 extension nodes, 5 extends refs, fields under the right parent, 0 errors.

Reviewer pass (eighth memo-load-bearing review this session):

- Class-level mapping table missing the extension rows (memo scrutiny-area #1 docstring rotting). Added.
- Same-file self-resolve edge case noted as future resolver filter target.
- emitScalarExtension's unreachable status confirmed adequate per its existing JSDoc (memo scrutiny-area #7 doesn't apply to private methods).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
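
The extension-to-kind mapping and the per-location id can be sketched as below. Everything here is illustrative: `emitExtension`, the `ExtensionNode` shape, and the plain-string id (a stand-in for the real id hash) are assumptions, and only the mapping table itself is taken from the commit message.

```typescript
type ExtensionForm = "type" | "interface" | "input" | "enum" | "union" | "scalar";
type NodeKind = "class" | "interface" | "enum" | "type_alias";

// Mapping from the commit message; scalar is null because the grammar
// version named there parses `extend scalar` as ERROR.
const EXTENSION_KIND: Record<ExtensionForm, NodeKind | null> = {
  type: "class",
  interface: "interface",
  input: "class",
  enum: "enum",
  union: "type_alias",
  scalar: null,
};

// Hypothetical node shape for illustration.
interface ExtensionNode {
  id: string;
  name: string;
  kind: NodeKind;
  extendsRef: string; // unresolved reference to the base type name
}

function emitExtension(
  form: ExtensionForm,
  name: string,
  filePath: string,
  line: number,
): ExtensionNode | null {
  const kind = EXTENSION_KIND[form];
  if (kind === null) return null;
  // Stand-in for the id hash: file path and line keep repeated
  // extensions of the same type distinct.
  const id = `${filePath}#${line}:${kind}:${name}`;
  return { id, name, kind, extendsRef: name };
}
```

Two `extend type User` blocks in one file then yield nodes with the same name and kind but different ids, which is the property the commit relies on.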

…ion A #5)

Closes the post-refactor backlog item #5: static large_method flags evergreen-large symbols indistinguishably from actively-growing ones; recently_grew surfaces the latter as the actionable refactor target.

Architecture:

- Migration 027 adds node_loc_history(node_id, indexed_ts, loc), one row per analyseProject pass that touched the symbol. The per-file content-hash short-circuit means unchanged symbols are not re-snapshotted; the compound index (node_id, indexed_ts DESC) makes "previous snapshot for this node" a sub-millisecond seek.
- src/db/queries-loc-history.ts exports recordLocSnapshots (batch insert with the canonical FK-safe SELECT WHERE EXISTS / ON CONFLICT pattern from queries-summaries) and getPriorLocSnapshots (per-node seek; one prepared stmt, N steps).
- biomarkers/index.ts: at analyseProject start, capture nowMs once; per-file batch-fetch prior snapshots, evaluate rules with prior + nowMs threaded into RuleContext, persist a fresh snapshot for every symbol whose metrics we computed.
- biomarkers/engine.ts: new evaluateRecentlyGrew rule. Triggers when prior_loc >= 20 AND current_loc > prior_loc * 1.5 AND (now - prior_ts) <= 30d. Severity ladder: 1.5–2x info, 2–3x warning, ≥3x error. Metric is round(ratio * 100) so the ranked-mode sort orders by growth percent.

Reviewer-memo gates passed:

- #4 schema-version asserts both bumped to 27 (foundation + pr19-improvements).
- #5 node_loc_history added to BOTH the migration AND schema.sql so fresh installs initialize correctly.
- #7 dead-export check: dropped recordLocSnapshot/getPriorLocSnapshot singulars after the reviewer flagged YAGNI; only the batch variants are exported (and called).

9 engine-level tests for the rule logic. Full suite 1393/34/0 (+9 from prior). Persistence path is exercised implicitly by every existing biomarker test (silent green confirms the schema + writes are sound on real fixtures).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
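
The recently_grew trigger and severity ladder translate into a few lines of code. This is a sketch under stated assumptions: the `Snapshot` and `Finding` shapes and the function signature are hypothetical, and a ratio of exactly 2x is assumed to land in the warning band since the commit message leaves that boundary open.

```typescript
type Severity = "info" | "warning" | "error";

// Hypothetical shapes; the real RuleContext is not shown in the commit.
interface Snapshot {
  loc: number;
  indexedTs: number; // epoch milliseconds of the prior snapshot
}
interface Finding {
  severity: Severity;
  metric: number; // growth percent, round(ratio * 100)
}

const MIN_PRIOR_LOC = 20;
const GROWTH_FACTOR = 1.5;
const WINDOW_MS = 30 * 24 * 60 * 60 * 1000; // 30 days

function evaluateRecentlyGrew(
  currentLoc: number,
  prior: Snapshot | undefined,
  nowMs: number,
): Finding | null {
  if (!prior || prior.loc < MIN_PRIOR_LOC) return null;
  if (nowMs - prior.indexedTs > WINDOW_MS) return null;
  if (currentLoc <= prior.loc * GROWTH_FACTOR) return null;
  const ratio = currentLoc / prior.loc;
  // Assumption: boundaries are inclusive on the upper band (2x is a warning).
  const severity: Severity = ratio >= 3 ? "error" : ratio >= 2 ? "warning" : "info";
  return { severity, metric: Math.round(ratio * 100) };
}
```

For example, a symbol that went from 20 to 40 lines within the window yields a warning with metric 200, while one that went from 20 to 30 lines does not trigger because the comparison is strict.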

First merge of agentic-backlog #7 (tool family consolidation). codegraph_pending_summaries + codegraph_save_summaries collapse into codegraph_summaries({action: 'pending' | 'save'}).

Establishes the family pattern that subsequent #7 merges (coverage, search, admin) will follow:

- Per-action handlers extracted to private `_*-{action}.ts` files (underscore-prefixed, no <NAME>_TOOL export, only handle<Action>). Convention documented in registry JSDoc.
- Family file (summaries.ts) owns the discriminator dispatch + the consolidated tool description.
- Old tool names remain as keep-shims for one release, each with one-time logDebug deprecation note. Per agentic-backlog memo: user signed off on API breakage but soft-landing helps external callers migrate without a hard cliff.
- isWriteTool: true on the family. This covers the save action; the read-only pending action also gets disabled by --no-write-tools as the conservative default.

Tests (5 new) verify: registry surface, missing/invalid action error, action-pending shape parity with legacy tool, action-save input validation, legacy-shim parity.

Eval: pre-existing -0.061 regression already on HEAD before this change (verified by stash + re-run); my change does not move it. Logged separately as a pre-existing issue.

Reviewer round 1 REQUEST_CHANGES on dead `_handleSummariesFamily` export + stale JSDoc cross-ref to non-existent shim-summaries.ts. Both addressed.

Suite 1450/34/0 (was 1445; +5 new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
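
The discriminator dispatch that the family file owns might look like the following. This is a minimal sketch, not the code in summaries.ts: the `ToolResult` shape, the handler bodies, the `summaries` argument name, and the error wording are assumptions; only the `action: 'pending' | 'save'` discriminator comes from the commit.

```typescript
// Hypothetical result and argument shapes.
interface ToolResult {
  ok: boolean;
  text: string;
}
type SummariesArgs = { action?: unknown; [key: string]: unknown };

// Stand-ins for the per-action handlers that live in private files.
function handlePending(_args: SummariesArgs): ToolResult {
  return { ok: true, text: "pending summaries listed" };
}

function handleSave(args: SummariesArgs): ToolResult {
  // Assumed input validation: `summaries` is a hypothetical field name.
  if (!Array.isArray(args.summaries)) {
    return { ok: false, text: "action='save' requires a `summaries` array" };
  }
  return { ok: true, text: `saved ${args.summaries.length} summaries` };
}

// Family entry point: validate the discriminator, then delegate.
function handleSummaries(args: SummariesArgs): ToolResult {
  switch (args.action) {
    case "pending":
      return handlePending(args);
    case "save":
      return handleSave(args);
    default:
      return {
        ok: false,
        text: `action must be 'pending' or 'save' (got ${JSON.stringify(args.action)})`,
      };
  }
}
```

Keeping the switch in one place means a missing or misspelled action produces a single, predictable error rather than falling through to a handler.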

Two related changes bundled per user policy ("hard cut-over over deprecation period, single-user codebase"):

(A) #7-2 coverage family: codegraph_coverage_load is gone; its behaviour is now codegraph_coverage({mode: 'load'}). The existing coverage tool gains 'load' as a fourth mode alongside symbol/ranked/stats. Per-action handler extracted to private _coverage-load.ts (mirrors the _summaries-*.ts pattern). Family tool marked isWriteTool: true (covers the load action; read modes also hidden under --no-write-tools as the conservative default, the same trade as summaries).

(B) Retroactive shim removal on #7-1: codegraph_pending_summaries and codegraph_save_summaries deleted entirely. Family tool codegraph_summaries({action}) is now the only entry. User decision: this is a single-user codebase, no external automation to soft-land.

Updated everywhere a legacy name appeared: registry test allowlist, disable-write-tools assertion, hint text in status.ts / module.ts / server-instructions.ts / agent-bridge.ts, comment in tool-types.ts, 3 CLI mirrors (pending-summaries / save-summaries / coverage-load now route to the family tool with the right action/mode arg).

Reviewer caught 1 stale-JSDoc + 2 polish:

- registry.ts JSDoc still said "AND its keep-shims" → updated
- "All 36 tools" → 37 (also bumped)
- coverage.ts `source` property description didn't mention mode='load' usage → expanded to cover both surfaces.

Suite 1454/34/0 (was 1450; +4 new coverage-family tests). Eval delta unchanged from baseline (pre-existing -0.061; not from this work, verified by stash + re-run).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Third merge of agentic-backlog #7 (tool family consolidation). Collapses three tools into one mode-discriminator-driven family:

- codegraph_search → codegraph_search({mode: 'exact'}) [default]
- codegraph_search_fuzzy → codegraph_search({mode: 'fuzzy'})
- codegraph_similar → codegraph_search({mode: 'semantic'})

Per-mode handlers extracted to private `_search-fuzzy.ts` / `_search-semantic.ts` (mirrors the `_summaries-*.ts` / `_coverage-load.ts` pattern). Old tool files deleted entirely (no shim; single-user codebase, hard cut-over).

Schema notes:

- `required: ['query']` removed from inputSchema since mode='semantic' accepts EITHER `symbol` OR `query`. Per-mode validation lives in the handlers (handleSearchSemantic checks for the XOR).
- Added per-mode args to the schema with `(mode=…)` prose annotations.

Hint-text + comment sweep:

- status.ts / module.ts / server-instructions.ts / embed.ts / installer/llm-setup.ts / db/queries-embeddings.ts / bin/codegraph.ts comments: every reference to `codegraph_similar` / `codegraph_search_fuzzy` updated to the family form.
- CLAUDE.md MCP-tools table: reviewer-flagged stale row removed, search row expanded to mention all three modes.
- 2 CLI mirrors (`codegraph search-fuzzy`, `codegraph similar`) now route to the family tool with the right mode arg.

Reviewer caught:

- CLAUDE.md tools table still listed `codegraph_search_fuzzy` → fixed (request_changes).
- `_search-semantic.ts` exported `handleSemanticSearch` instead of the convention `handleSearchSemantic` → renamed (info polish).

Suite 1460/34/0 (was 1454; +6 new family tests, fuzzy/semantic existing tests updated to call the family). Eval delta unchanged from baseline; search-related case still recall=1.00.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
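
Because `required: ['query']` was dropped from the schema, per-mode validation has to happen in code. A minimal sketch of that check follows; `validateSearchArgs`, the `SearchArgs` shape, and the error strings are hypothetical, while the exact-by-default mode and the symbol-XOR-query rule for semantic mode come from the commit message.

```typescript
// Hypothetical argument shape for the family tool.
interface SearchArgs {
  mode?: "exact" | "fuzzy" | "semantic";
  query?: string;
  symbol?: string;
}

// Returns an error message, or null when the arguments are acceptable.
function validateSearchArgs(args: SearchArgs): string | null {
  const mode = args.mode ?? "exact"; // exact is the documented default
  const hasQuery = typeof args.query === "string" && args.query.length > 0;
  const hasSymbol = typeof args.symbol === "string" && args.symbol.length > 0;
  if (mode === "semantic") {
    // XOR: exactly one of the two inputs must be present.
    return hasQuery !== hasSymbol
      ? null
      : "mode='semantic' needs exactly one of `symbol` or `query`";
  }
  return hasQuery ? null : `mode='${mode}' requires \`query\``;
}
```

The trade-off of this design is that the JSON schema alone no longer rejects an empty call, so each handler must return a clear error for its own mode.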

Final merge of agentic-backlog #7 (tool family consolidation). Largest by tool count: 5 → 1.

- codegraph_init → codegraph_admin({action: 'init', path})
- codegraph_uninit → codegraph_admin({action: 'uninit', path, confirm})
- codegraph_unlock → codegraph_admin({action: 'unlock', projectPath?})
- codegraph_sync → codegraph_admin({action: 'sync', projectPath?})
- codegraph_index → codegraph_admin({action: 'index', projectPath?, force?})

Differences from prior merges:

- Action handlers are inlined in admin.ts (single ~220 LOC file) rather than split into _admin-<action>.ts files. Each body is small (40-65 LOC) with no shared helpers, so the split would be noise. Convention doc in registry.ts unchanged; the split is for non-trivial bodies.
- Two distinct arg shapes preserved: init/uninit use `path` (required, may target a directory with no .codegraph/ yet); unlock/sync/index use the standard optional `projectPath`.
- bypassFreshnessGate + isWriteTool both transfer to the family (all 5 prior tools had both flags individually).

CLI commands (init/sync/index/uninit/unlock) intentionally NOT touched in this commit: they directly call CodeGraph APIs (don't go through MCP), so there's no shim concern. CLI family-alignment to mirror MCP shape is a separate task.

Hint-text + JSDoc sweep: status.ts / module.ts / history.ts / risk-review.ts / tools.ts / tools/types.ts / bin/codegraph.ts. Every reference to a retired tool name updated to the new family form.

Reviewer round 1 APPROVE with 2 docstring-rot info items, both addressed in this commit:

- src/mcp/tools.ts:49 ToolHandlerOptions.disableWriteTools JSDoc parenthetical updated for the new family-tool roster.
- .claude/reviewer-memo.md CLI/MCP-alignment section updated to reflect the codegraph_admin family.

Suite 1465/34/0 (was 1460; +5 new admin-family tests + retargeted existing init/sync/index/uninit/unlock/server-options tests). Eval delta unchanged from baseline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Per the user directive: CLI tool names match MCP tool names so agents and humans share one vocabulary, no two-set memorization.

Family-aligned CLI structure mirrors the MCP families shipped in agentic-backlog #7:

- codegraph admin init|uninit|unlock|sync|index [path] ↔ codegraph_admin({action: '...'})
- codegraph summaries pending|save [json-file] ↔ codegraph_summaries({action: '...'})
- codegraph search [query] --mode exact|fuzzy|semantic ... ↔ codegraph_search({mode: '...'})
- codegraph coverage [symbol] --mode symbol|ranked|stats|load ... ↔ codegraph_coverage({mode: '...'})

Implementation:

- adminCmd / summariesCmd parents declared once at module load (Commander's addCommand pattern; the bare-string `command('admin init')` form collides on second registration).
- coverage / search use --mode flag rather than nested subcommand so the common case (`codegraph search foo`, `codegraph coverage`) stays terse; this matches the MCP `mode:` arg shape exactly.
- 10 retired CLI command bodies deleted; behavior preserved either inside the family handler or rehomed unchanged under the family.
- Hint text + JSDoc swept across 5 production files; CLAUDE.md CLI section rewritten to show the family shape; 2 test assertions updated to match new hint strings.

Reviewer caught:

- File-header JSDoc still showed `codegraph search <query>` (required) after the positional was made optional → updated.
- Missing mutual-exclusion guard for --same-language / --different-language in the unified search command (the prior `similar` command had it) → restored.

Suite 1465/34/0 (count unchanged; this commit is mostly rewrites of strings + Commander wiring). Eval baseline unchanged from prior commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three friction items + one test regression follow-up to e11b2f9.

## colbymchenry#37: pool-aware llm auto-detect

probeKnownLocalServers() in src/llm/detect.ts probed only the 4 hardcoded legacy endpoints (Ollama 11434, llama.cpp 8080, MLX 8081, LM Studio 1234). When the user spawns a recommended-tier pool via "codegraph llm pool up" (chat 8085-8087, embed 8090-8091), the auto-detect missed it and reported "No local LLM detected" even though a healthy pool existed on the same machine.

Fix: extend probeKnownLocalServers(probeTimeoutMs, projectPath?). When projectPath is supplied, read .codegraph/pool.json via readPoolState() (from pool-controller.ts) and probe each member's endpoint in addition to (not instead of) the legacy 4. Returns a new "pool: ProbedPool | null" field; null when projectPath is unset or no pool.json exists; arrays empty when pool.json lists members but none respond.

The wizard in src/installer/llm-setup.ts surfaces pool members as pool-chat-i / pool-embed-i options prepended before legacy entries in buildChatProviderOptions / buildEmbeddingProviderOptions; resolveChatChoice / resolveEmbeddingChoice parse the new keys via regex dispatch.

New test file __tests__/llm-detect-pool-aware.test.ts (6 cases): no projectPath, missing pool.json, reachable chat member, reachable embed member, unreachable members, both reachable together.

Static import for readPoolState; no circular dep (pool-controller.ts does not import detect.ts).

## colbymchenry#45: bin/codegraph.ts:1251 legacy alias typo

Error message referenced config.llm.summarizeLlmModel as a "legacy" field. That identifier doesn't exist; the real legacy alias per LEGACY_LLM_FIELD_MAP in src/config.ts is the flat-field config.llm.chatModel. Fix: replace the parenthetical hint accordingly.

## colbymchenry#46: stale config field names in agent-facing error messages

Errors in src/mcp/tools/_search-semantic.ts (lines 74, 137), src/mcp/tools/ask.ts (line 110), src/mcp/tools/dead-code.ts (line 30) referenced LEGACY field names (config.llm.chat, config.llm.askChat, config.llm.embeddings) as if they were canonical. These strings are sent to the user when LLM is unavailable, so a user following the message ends up configuring stale field names.

Fix: lead with the canonical purpose-suffixed names (summarizeLlm / askLlm / embeddingLlm), demote legacy aliases to a footnote ("legacy ... also accepted").

## Test regression fix

Commit e11b2f9 renamed the status output label "Chat model:" to "Summarize model:" in src/mcp/tools/status.ts:596 but did not update __tests__/mcp-llm-visibility.test.ts, which still asserted on the old wording. Three assertions updated (lines 70, 99, 127) plus one describe-text aligned (askChat differs from chat -> askLlm differs from summarizeLlm).

## Reviewer pass

Independent reviewer (with .claude/reviewer-memo.md prepended) caught one item under memo recurring scrutiny area #1 (docstring rot): the module-level JSDoc at src/llm/detect.ts:1-13 still described the module's job as "detect Ollama, llama.cpp, MLX and LM Studio instances on conventional ports" without mentioning pool-aware probing. The function-level JSDoc on probeKnownLocalServers was updated by the implementer but the file header was not. Fixed in this commit.

## Verification

- 10 files changed, +180/-22, plus 1 new test file (189 lines, 6 tests)
- npm run typecheck: clean
- 82/82 tests pass across 7 LLM-related test files
- probeKnownLocalServers call sites: exactly 2 (definition + src/installer/llm-setup.ts:83 caller)
- New exports ProbedPool + (now-exported) ProbedServer both have concrete in-tree callers, verified per reviewer-memo item #7

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… intent coverage hint + compare_to_ref skipped/suppress flags

Five friction-tracker items addressed. Three sub-agents in parallel on disjoint files; reviewer pass with .claude/reviewer-memo.md prepended caught two real correctness bugs and two info-level items which are all addressed in this commit.

## colbymchenry#17 — dead_code static excludes test-bed fixtures

src/mcp/tools/dead-code.ts gains an excludeFixtures: boolean arg (default true). When true, formatStaticDeadCode requests 4× overscan from findGraphCandidates, runs filterFixtureNodes (regex match against docs/test-beds/, __tests__/fixtures/, test/fixtures/, spec/fixtures/), then slices to the user's maxCandidates. New helper filterFixtureNodes is module-private (no speculative export).

Reviewer-caught correctness gap: the schema described excludeFixtures as a general filter but llmFindDeadCode in mode='judge' was called with only { maxCandidates }, so fixture filtering silently no-op'd in judge mode. Fixed by extending the schema description to caveat "applies to mode='static' only — judge mode tracked as follow-up"; the wider judge-mode wiring is non-trivial (llmFindDeadCode's args interface needs extending) and is left as a tracked task.

Reviewer-caught info gap: 4× overscan doesn't suffice when fixture density exceeds 75% of the candidate set. Now surfaces a warning when overscan was exhausted AND we still under-filled — the agent sees "raise maxCandidates if you need more" instead of silently truncating.

New test file __tests__/mcp-dead-code-fixtures.test.ts (3 cases): filter on, filter off, spec/fixtures pattern.

## colbymchenry#19 — search mode='intent' attaches coverage hint on 0 hits

src/mcp/tools/_search-intent.ts. When handleSearchIntent returns empty, computes summary coverage and appends one of two hints:

- coverage < 50%: "intent-search depends on LLM summaries (current coverage X%) — run codegraph summarize to expand the corpus, or fall back to mode='exact' / codegraph_grep"
- coverage >= 50%: "0 hits at X% coverage — concept may not be summarised yet, or may not exist in the codebase. Try codegraph_explore"

Non-empty results unchanged. Coverage-stat errors are caught and logged so they don't fail the search.

Reviewer-caught correctness bug: getSummaryCoverage(cg.queries) was called WITHOUT a kinds filter; the helper's own JSDoc warns that "counting parameters/imports/files in the denominator would understate coverage and confuse the user". Fixed by passing SUMMARIZABLE_KINDS (already exported from src/llm/summarizer.ts and used by status.ts the same way). Three OTHER call sites in the codebase also pass no kinds — filed as colbymchenry#49 follow-up since they're out of scope for this commit.

New tests added to __tests__/search-intent.test.ts (2 cases: low-coverage hint, high-coverage hint).

## colbymchenry#23 — compare_to_ref surfaces skipped (non-TS / non-indexed) files

src/compare/index.ts adds filesSkipped: number to CompareResult. The compare logic counts files git reports as changed but cg can't structurally diff (non-indexed languages, .md/.json, binary, etc.). Formatter src/mcp/tools/compare.ts surfaces "> N file(s) skipped (non-indexed or non-TS)" only when count > 0 (no noise on clean diffs). Manual eyeball on HEAD~5 confirmed correct count.

## colbymchenry#24 — compare_to_ref suppressLineRangeOnly flag

src/compare/index.ts adds suppressLineRangeOnly: boolean (default false for backward compat — existing callers asserting on result.totals.modified would break). Adds lineRangeOnlyCount to FileDelta. New helpers isPureLineRangeOnly / applyLineRangeOnlySuppression (module-private). When true, FileDeltas where every modified symbol's reasons are exactly ['line range changed'] (no signature change, no modifier flip, no body diff) collapse into a single per-file roll-up: "src/foo.ts: 14 symbols renumbered (no content change)". Mixed files (real change + renumber) keep real changes individually shown. MCP schema in compare.ts exposes the flag.

Manual eyeball on HEAD~5 of this repo: 96/174 modified symbols (55%) were pure-renumber noise that suppression collapsed cleanly into 7 roll-up lines.

New tests added to __tests__/compare.test.ts (4 cases): filesSkipped counts non-TS files, filesSkipped: 0 doesn't add the line, suppress collapses pure-renumber, mixed files keep real changes individually.

## colbymchenry#34 — at_range rejects paths outside project root

src/mcp/tools/at-range.ts. New validateFileWithinRoot helper: path.resolve canonicalization on both sides, checks "equals root OR starts with root + sep". Single-range and bulk-range forms both validate; bulk form fails the whole call if ANY range is out-of-root (no silent filtering).

Reviewer-caught info: documented symlink limitation in JSDoc — within-root symlinks pointing OUTSIDE the root will pass; if untrusted symlinks become an issue, swap to fs.realpathSync.

New tests added to __tests__/at-range.test.ts (2 cases: traversal '../../etc/passwd' rejected, bulk form with one out-of-root range fails the whole call).

## Reviewer-caught items beyond the original 5

- New friction colbymchenry#48 filed: typescript-lsp plugin reverts sub-agent Edit calls on first application (not a codegraph defect; harness interaction). Sonnet caught it during colbymchenry#23/colbymchenry#24 work and re-applied successfully on second pass.
- New friction colbymchenry#49 filed: 3 other getSummaryCoverage call sites in status.ts and bin/codegraph.ts have the same denominator-inflation bug — out of scope for this commit, tracked separately.
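For illustration only, the at_range root check described above ("equals root OR starts with root + sep") could be sketched roughly as follows. The names `isWithinRoot` and `assertAllWithinRoot` are hypothetical stand-ins, not the signature of the real `validateFileWithinRoot`, and resolving relative inputs against the root is an assumption. Like the helper it imitates, this sketch does not follow symlinks.

```typescript
import * as path from "node:path";

// Hypothetical sketch: true when `file` resolves to the root itself or to a
// location underneath it. Relative inputs are resolved against the root.
export function isWithinRoot(root: string, file: string): boolean {
  const resolvedRoot = path.resolve(root);
  const resolvedFile = path.resolve(resolvedRoot, file);
  // Avoid doubling the separator when the root is the filesystem root.
  const prefix = resolvedRoot.endsWith(path.sep)
    ? resolvedRoot
    : resolvedRoot + path.sep;
  return resolvedFile === resolvedRoot || resolvedFile.startsWith(prefix);
}

// Bulk form: a single out-of-root path fails the whole call, mirroring the
// "no silent filtering" behaviour described in the commit message.
export function assertAllWithinRoot(root: string, files: string[]): void {
  const offenders = files.filter((f) => !isWithinRoot(root, f));
  if (offenders.length > 0) {
    throw new Error(`path(s) outside project root: ${offenders.join(", ")}`);
  }
}
```

Appending the separator before the prefix test is what keeps a sibling directory such as `/repo-other` from passing as a child of `/repo`.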
## Verification

- 9 files (8 modified + 1 new test), +398/-13
- npm run typecheck — clean
- 50/50 tests pass in focused slice (compare / at-range / mcp-dead-code-fixtures / search-intent)
- New module-private helpers (filterFixtureNodes, isPureLineRangeOnly, applyLineRangeOnlySuppression, validateFileWithinRoot) all have concrete in-tree callers in the same diff — no speculative exports per reviewer-memo item #7.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
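A minimal sketch of the suppressLineRangeOnly roll-up described in this commit, under stated assumptions: the `ModifiedSymbol` and `FileDeltaSketch` shapes and the `renderDelta` function are hypothetical, and mixed files are assumed to list every modified symbol individually (the commit message only says real changes stay individually shown).

```typescript
// Hypothetical shapes; the real FileDelta in src/compare/index.ts has more fields.
interface ModifiedSymbol {
  name: string;
  reasons: string[];
}

interface FileDeltaSketch {
  file: string;
  modified: ModifiedSymbol[];
}

const LINE_RANGE_ONLY = "line range changed";

// A symbol is pure-renumber when its only reason is the line-range change.
function isPureLineRangeOnly(sym: ModifiedSymbol): boolean {
  return sym.reasons.length === 1 && sym.reasons[0] === LINE_RANGE_ONLY;
}

// Collapse a file into one roll-up line only when EVERY modified symbol is
// pure-renumber; otherwise fall back to one line per symbol.
function renderDelta(delta: FileDeltaSketch, suppress: boolean): string[] {
  const allPure =
    delta.modified.length > 0 && delta.modified.every(isPureLineRangeOnly);
  if (suppress && allPure) {
    return [
      `${delta.file}: ${delta.modified.length} symbols renumbered (no content change)`,
    ];
  }
  return delta.modified.map(
    (s) => `${delta.file}: ${s.name} (${s.reasons.join(", ")})`,
  );
}
```

Keeping the flag off by default leaves the per-symbol output unchanged for existing callers.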

…polish items

Eight friction-tracker items addressed in parallel by sub-agents (2 Haiku, 1 Sonnet); reviewer caught one real correctness edge case (bucket overlap on degenerate fresh-index shapes) plus two info items, all addressed in this commit.

## colbymchenry#21 — at_range cost-benefit JSDoc

Doc-only update to src/mcp/tools/at-range.ts. Tool description and JSDoc now state "pays off most on dense files (100+ symbols) and multi-range bulk lookups; for tiny preview fetches on small files, raw `head -N` is comparable." No code change.

## colbymchenry#25 — blame surfaces rename detection inline

src/git-utils.ts gains a new helper `getFileFollowEarliestTs` that runs `git log --follow --format=%aI -- <path>` (5 s timeout, ISO timestamp). src/mcp/tools/blame.ts compares the rename-aware oldest commit against the line-range-only timeline's oldest. When `--follow` reaches further back, appends a warning that the timeline truncated at the file's rename and points at `git log --follow <file>` for the full history. Edge cases handled: not-a-git-repo, timeout, empty timeline.

Test approach uses `vi.spyOn` to mock pre-rename history because real fixtures are unreliable: modern git's `git log -L` follows renames via content-similarity tracking, making a deterministic black-box rename-fixture impossible.

## colbymchenry#26 — hotspots split into 3 mutually-exclusive categories

src/db/queries-history.ts gains `getCategorizedHotspots` and src/mcp/tools/hotspots.ts gains a `category: 'risk' | 'maintenance' | 'brittle' | 'all'` arg (default 'risk' for backward compat). Thresholds use 75/25 percentile rather than hardcoded magic numbers — they adapt as the project grows.

Buckets:

- risk : high centrality AND high churn — where bugs hide
- maintenance : high churn AND not-high centrality — refactor target
- brittle : high centrality AND not-high churn — stable critical

Reviewer-caught correctness bug: original filters used `<= low` for the secondary axis, which collapsed buckets when high == low (fresh index where centrality is uniformly zero, or repos where every file has identical churn). A file at the threshold could appear in both risk AND maintenance simultaneously. Fixed by switching maintenance and brittle to `< highThreshold`, making them strictly disjoint even on degenerate inputs. Also added a more-hint when any section hit the per-category cap (the existing `category='risk'` path already had this; `category='all'` now mirrors).

New `__tests__/hotspots.test.ts` (4 cases) covers all-section rendering, single-category dispatch, and the backward-compat default path.

## colbymchenry#27 — search centrality:high differentiates "hook hasn't run" vs "no node met the threshold"

src/mcp/tools/search.ts. `probeCentralityFilterCulprit` now runs a sub-millisecond probe `SELECT 1 FROM nodes WHERE centrality IS NOT NULL LIMIT 1` (uses the existing `idx_nodes_centrality` index). When ALL nodes have NULL centrality the agent gets the existing "centrality hook hasn't run — run codegraph index" hint. When SOME nodes have centrality but none cleared the filter, a different hint suggests relaxing the threshold. Two-case hint instead of one.

## colbymchenry#28 — search exact promotes multi-token-query warning to pre-result

src/mcp/tools/search.ts. `buildConceptHintIfNeeded` now returns `{ preResult, postResult }` instead of a single string. When the query splits into 2+ space-separated non-qualified tokens (likely "multiple symbol names"), the agent gets a leading hint to call search per name OR use codegraph_explore — BEFORE the result list rather than buried after. Field-qualified tokens (`kind:function lang:typescript`) and single-free-token queries are unchanged.

## colbymchenry#33 — callers on "constructor" with no callers explains the instantiates-edge model

src/mcp/tools/callers.ts. When the resolved symbol is `kind=method && name=constructor` AND the callers list is empty, appends a one-line note: "constructors are invoked via `new ClassName(...)`, which graph-edges as `instantiates` on the parent class. To find construction sites, run codegraph_callers on the enclosing class instead of 'constructor'." Both the multi-match and single-match paths got the note (guarded by the same kind+name+empty check). Constructors WITH callers (e.g. via super()) render normally — no false positive.

## colbymchenry#35 — node.symbol tie-break prefers non-fixture, then centrality

src/mcp/tools/symbol-resolver.ts. `pickFromMultipleExactMatches` now filters out fixture paths first (falls back to all-fixture when that's all that matches), then sorts by centrality DESC (NULL → 0). A `helper` symbol that exists in both `src/core.ts` and `docs/test-beds/fixture.ts` resolves to `src/core.ts` as the displayed primary. Tier #3 (last_touched_ts) deferred — data not in the resolver's existing query.

Reviewer-caught DRY issue: the fixture-path regex set was duplicated between symbol-resolver.ts and dead-code.ts (introduced by parallel sub-agents on the same brief). Extracted to `isFixturePath` in src/mcp/tools/shared.ts; both consumers now import the single source.

## colbymchenry#49 — getSummaryCoverage denominator threading (3 call sites)

src/bin/codegraph.ts (lines 348, 1461) + src/mcp/tools/status.ts (line 440). All three pass `SUMMARIZABLE_KINDS` to getSummaryCoverage to match the canonical pattern from the previously-fixed _search-intent.ts:218. Without this, the helper falls back to COUNT(*) which inflates the denominator with parameters / imports / file nodes — its own JSDoc explicitly warns against this.
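The tie-break order described above (non-fixture first, then centrality DESC with NULL treated as 0) might look roughly like this. It is a sketch, not the resolver's code: the `Candidate` shape and `pickPrimary` name are hypothetical, and the regexes are guesses built from the four fixture directories the commit names, since the real `isFixturePath` patterns are not shown.

```typescript
// Hypothetical candidate shape; the real resolver works on graph nodes.
interface Candidate {
  filePath: string;
  centrality: number | null;
}

// Assumed patterns, derived from the directories listed in the commit message.
const FIXTURE_PATTERNS: RegExp[] = [
  /(^|\/)docs\/test-beds\//,
  /(^|\/)__tests__\/fixtures\//,
  /(^|\/)test\/fixtures\//,
  /(^|\/)spec\/fixtures\//,
];

function isFixturePath(filePath: string): boolean {
  return FIXTURE_PATTERNS.some((re) => re.test(filePath));
}

// Prefer non-fixture matches; fall back to the full set when every match is
// a fixture. Within the pool, highest centrality wins (null counts as 0).
function pickPrimary<T extends Candidate>(matches: T[]): T | undefined {
  const nonFixture = matches.filter((m) => !isFixturePath(m.filePath));
  const pool = nonFixture.length > 0 ? nonFixture : matches;
  const ranked = [...pool].sort(
    (a, b) => (b.centrality ?? 0) - (a.centrality ?? 0),
  );
  return ranked[0];
}
```

Filtering before sorting means a high-centrality fixture can never outrank a real source file, which matches the `src/core.ts` example in the commit message.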
## Test re-additions

Sub-agent #1 deleted its own test files for colbymchenry#33 and colbymchenry#35 (a brief misread — "DO NOT commit" was interpreted as "DO NOT leave tests in repo"). Re-added as `__tests__/mcp-callers-constructor-and-fixture-tiebreak.test.ts` covering: constructor-with-no-callers note appears, non-constructor-method note absent, name-collision picks non-fixture primary.

## Verification

- 15 modified files + 2 new test files, +619/-55
- npm run typecheck — clean
- 74/74 tests pass across 9 LLM/search/hotspots-related test files
- New exports: `isFixturePath` (shared.ts), `getCategorizedHotspots` (queries-history.ts), `getFileFollowEarliestTs` (git-utils.ts) — all have concrete in-tree callers in the same diff per reviewer-memo item #7

Reviewer pass with .claude/reviewer-memo.md prepended caught:

- (request_changes) bucket-exclusivity edge case → fixed
- (info) isFixturePath duplication → deduped
- (info) category='all' missing more-hint → added

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
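To make the hotspots bucket-exclusivity fix from this commit concrete, here is a hedged sketch of the three predicates after the switch to `< highThreshold`. The `FileStat` and `Thresholds` shapes, the predicate names, and the nearest-rank percentile are all assumptions; the real `getCategorizedHotspots` runs against the database and may compute thresholds differently.

```typescript
// Hypothetical shapes for illustration only.
interface FileStat {
  file: string;
  centrality: number;
  churn: number;
}

interface Thresholds {
  centrality: number;
  churn: number;
}

// Nearest-rank percentile (assumed definition).
function percentile(values: number[], p: number): number {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  const idx = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  return sorted[idx] ?? 0;
}

// "High" on each axis is the 75th percentile of that axis.
function highThresholds(files: FileStat[]): Thresholds {
  return {
    centrality: percentile(files.map((f) => f.centrality), 75),
    churn: percentile(files.map((f) => f.churn), 75),
  };
}

// Every secondary-axis test is the strict negation of the "high" test, so a
// file can satisfy at most one predicate even when all values are equal.
const isRisk = (f: FileStat, t: Thresholds): boolean =>
  f.centrality >= t.centrality && f.churn >= t.churn;
const isMaintenance = (f: FileStat, t: Thresholds): boolean =>
  f.churn >= t.churn && f.centrality < t.centrality;
const isBrittle = (f: FileStat, t: Thresholds): boolean =>
  f.centrality >= t.centrality && f.churn < t.churn;
```

On a fresh index where centrality is uniformly zero, the high threshold is also zero, so every file counts as high-centrality and lands in risk or brittle, never in two buckets at once.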

Subtask #7.1 of the embedding-features arc Stage 4. New MCP tool: given changed files OR symbols, return the top-K semantic lookalikes — candidates that may need the same kind of change.

Pipeline:

- Resolve changed symbols via SQL (excludes file/import/export kinds).
- Detect active embedding model (via resolveLlmConfig + DB probe).
- For each changed symbol: getEmbeddingForNode → bytesToVector → findSimilarViaVec(k + N), aggregating per-nodeId max similarity.
- Filter out the changed set itself, sort by max-sim desc, slice top K.
- Markdown output: changed-symbols header + lookalikes with `name (kind) — score`, file:line, signature.

Read-only (no DB writes). Defensive: returns clear text rather than throwing when embeddings are absent or no inputs resolve.

Inputs: `files: string[]` (project-relative paths) AND/OR `symbols: string[]` (qualified or simple names). At least one required. Optional `k` (default 5, max 50) and `projectPath`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
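The aggregation steps in the pipeline above (per-nodeId max similarity, drop the changed set, sort, slice top K) can be sketched in memory as below. This is not the tool's implementation: a brute-force cosine scan stands in for `findSimilarViaVec`, the `Embedded` shape and `topLookalikes` name are hypothetical, and the lower clamp of 1 on `k` is an assumption (the source only states default 5 and max 50).

```typescript
// Hypothetical in-memory stand-in for a stored embedding row.
interface Embedded {
  nodeId: string;
  vector: number[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function topLookalikes(
  changed: Embedded[],
  corpus: Embedded[],
  k = 5,
): Array<{ nodeId: string; score: number }> {
  const clampedK = Math.min(Math.max(1, k), 50); // lower bound is assumed
  const changedIds = new Set(changed.map((c) => c.nodeId));
  const best = new Map<string, number>();
  for (const source of changed) {
    for (const cand of corpus) {
      if (changedIds.has(cand.nodeId)) continue; // exclude the changed set
      const score = cosine(source.vector, cand.vector);
      const prev = best.get(cand.nodeId);
      // Keep the highest similarity seen for this candidate.
      if (prev === undefined || score > prev) best.set(cand.nodeId, score);
    }
  }
  return [...best.entries()]
    .map(([nodeId, score]) => ({ nodeId, score }))
    .sort((x, y) => y.score - x.score)
    .slice(0, clampedK);
}
```

Taking the max rather than the mean means a candidate that closely resembles even one changed symbol still surfaces.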

Subtask #7.2. 8 vitest cases over 3 describe blocks:

- Input validation: missing files+symbols → error; bogus paths → "no symbols resolved"; resolved-but-no-embeddings → clear text.
- Resolution + ranking: file → symbols → KNN → output mentions changed names; symbol-name input resolves the same way; the "Top N lookalikes" section excludes the changed-symbol set itself; k clamps to 50 silently.
- Registration: name + description + schema shape.

Synthetic deterministic embeddings via upsertSymbolEmbedding so tests stay LLM-free. Vec-loaded assertions are gated by `hasVecExtension()` so the suite stays green on CI runners that don't ship sqlite-vec.

Suite: 2207 -> 2215 (+8). Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

CLI surface for the Stage 4 #7 codegraph_review_neighbors MCP tool. Mirrors the MCP shape: `--files a,b,c` and/or `--symbols X,Y,Z` plus optional `-k <n>`. Closes the post-Stage-7 alignment gap surfaced by the audit task.

Audit summary:

- 44 MCP tools registered.
- All MCP tools that match codegraph's "operate on a project / its data" shape now have a CLI counterpart.
- By-design MCP-only: codegraph_session, codegraph_note, codegraph_local_chat, codegraph_playbook (agent-state / conversational shapes that don't translate to CLI).
- By-design CLI-only: install, llm setup, llm pool up/down/status, serve --mcp, viewer (lifecycle / detached-process shapes the MCP server can't safely manage).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…raph friction items

Biomarker pass: 10 findings (1 error / 3 warnings / 6 info) -> 0/0/0.

- has() LRU low_coverage: new __tests__/node-lru-cache.test.ts exercises the previously 0%-covered path via QueryBuilder.nodeCache.
- hasExtraStatement complex_method (cyc 17 -> 0): table-driven STEP_HANDLERS dispatch in src/mcp/tools/sql.ts replaces the inline state machine.
- buildQueryOutput / appendCompletenessAndBudget long_parameter_list: bundled positional args into *Args interfaces.
- findLowCoverage + buildCoverageTips magic_number: extracted PCT_SCALE / COVERAGE_PCT_ROUND / CENTRALITY_ROUND + MS_PER_DAY / STALE_AFTER_DAYS named constants (docstrings cite values to satisfy stale_doc).
- 4 unused exports dropped: RULE_NAMES deleted; LabelConfidence/LabelSource un-exported; duplicate CURRENT_SCHEMA_VERSION at migrations/index.ts:226 removed (derive directly in migrations.ts).

Codegraph friction fixes:

- #7 unused_export FP on aliased named imports: empirically zero remaining FPs on the in-tree corpus. Rule comment updated to reflect that emitNamedImportRefsFromFile + pickMatchingImport resolve aliased imports correctly.
- colbymchenry#8 n_xxxxxxxx UIDs not accepted by codegraph_biomarkers / _coverage / _role mode=symbol: each tool's resolveSymbolToNodeId now checks RefIdCache.isUid first and resolves via the per-server cache. handleGetRoleOf bundled into HandleGetRoleOfArgs after refIds pushed it past the long-param tier. Punctuation guard aligned across all three siblings. Regression test in __tests__/biomarkers.test.ts mints a UID via codegraph_find and round-trips it through codegraph_biomarkers.

Gates: tsgo typecheck clean, vitest 2633/0/34, codegraph_biomarkers stats empty after sync, independent reviewer APPROVE on both diff slices.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…nt cache key

Same latent-staleness shape as friction #7 (closed in 9cb4ab5): the centrality fingerprint at `last_centrality_fingerprint` hashed node + edge counts + max(updated_at) but NOT the algorithm-defining constants (PR_DAMPING / PR_ITERATIONS / PR_EDGE_KINDS). If any of those change, the stored fingerprint can match → stale centrality scores persist until the next clearStructural.

Probability of bite today: low — those constants haven't moved in a long time. But the cost of preemption is one string concat; the cost of post-hoc diagnosis is hours.

Other surfaces I investigated and consciously left alone:

- summary_store / embedding_store / role_assignments — already self-healing via content-addressed body_hash
- parse_cache — has PAYLOAD_VERSION embedded per row
- hnsw_meta — rebuilt on row-count drift (M / efConstruction are stable)
- node_metrics / code_health_findings — already invalidated by BIOMARKER_CACHE_KEY (which now rides on EXTRACTION_LOGIC_VERSION)
- CHURN_ALGO_VERSION — already invalidates via explicit storedAlgo !== current compare
- PAYLOAD_VERSION ↔ EXTRACTION_LOGIC_VERSION — intentionally decoupled per the docstring at extraction-logic-version.ts:18

One soft gap surfaced but NOT fixed in this diff: trained heads (role / dead-code / test-need / doc-quality / refactor-priority) use a 24h file-mtime freshness gate that doesn't auto-invalidate on EXTRACTION_LOGIC_VERSION bump. Trade-off is heavyweight (5 heads × training cost per extractor bump) so I left it as the user's call.

Implementation:

- `centralityAlgoTag()` returns `algo:d={damping}|i={iterations}|k={sorted edge kinds}` — deterministic across literal-tuple reorderings.
- `computeFingerprint` prepends the tag, so any constant tweak produces a different prefix.
- Regression assertion in `__tests__/mcp-reindex.test.ts` checks the algo-tag prefix is present in the stored fingerprint.
- Module docblock updated to describe the four-component fingerprint and the algo-constant force-recompute trigger.

Gates: tsgo typecheck clean, vitest 2633/0/34, codegraph_biomarkers stats empty, independent reviewer APPROVE (after one docblock-rot fix on a first-pass REQUEST_CHANGES — module-level header was missed on the first attempt).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
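As a rough illustration of the algo-tag idea in this commit, the sketch below builds a tag in the stated `algo:d=...|i=...|k=...` shape and prepends it to a fingerprint. Several details are assumptions: the functions take their inputs as parameters (the real `centralityAlgoTag()` reads module constants), edge kinds are joined with a comma, the sample constant values are invented, and the structural part is concatenated in plain text rather than hashed.

```typescript
// Hypothetical structural stats; the real fingerprint hashes these.
interface GraphStats {
  nodeCount: number;
  edgeCount: number;
  maxUpdatedAt: number;
}

// Sorting a copy of the edge kinds makes the tag independent of the order in
// which the kinds are listed in source.
function centralityAlgoTag(
  damping: number,
  iterations: number,
  edgeKinds: readonly string[],
): string {
  const kinds = [...edgeKinds].sort().join(","); // separator is assumed
  return `algo:d=${damping}|i=${iterations}|k=${kinds}`;
}

// The tag is prepended, so changing any algorithm constant changes the
// fingerprint even when the graph itself is untouched.
function computeFingerprint(stats: GraphStats, algoTag: string): string {
  return `${algoTag}|n=${stats.nodeCount}|e=${stats.edgeCount}|u=${stats.maxUpdatedAt}`;
}

// A stored fingerprint is stale when it differs from the freshly computed one.
function needsRecompute(stored: string | null, current: string): boolean {
  return stored !== current;
}
```

With this shape, bumping the damping factor alone is enough to force a recompute on the next sync, which is the gap the commit closes.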

This was referenced May 1, 2026

perf+fix(llm): preserve LLM caches across --force [WIP 2/5] #4

Closed

feat+perf(llm): split chat/askChat provider + summary batching [WIP 3/5] #5

Closed

fix: GC orphan summaries + WASM savepoint nesting [WIP 4/5] #6

Closed

andreinknv closed this May 1, 2026

andreinknv deleted the pr5-mcp-visibility branch May 1, 2026 11:37

andreinknv wants to merge 1 commit into pr4-gc-wasm-fixes from pr5-mcp-visibility
