fix: GC orphan summaries + WASM savepoint nesting [WIP 4/5]#6
Closed
andreinknv wants to merge 2 commits into
Conversation
Surfaced by end-to-end measurement: after running `--force` with two different chat models (Phase A-MLX with qwen3-coder, then Phase A bridge+batched with claude-haiku-4-5), the codegraph self-repo's symbol_summaries table held 723 rows — 675 fresh haiku entries plus 48 stale qwen3-coder entries whose node_ids no longer exist in the current `nodes` table.

The orphans accumulated because the cache-preservation work in bd26945 (clearStructural with FK toggle) left summary rows behind on every --force, and rows that the by-content-hash fallback in summarizeAll didn't reclaim onto a current node_id were never cleaned up. The impact is bookkeeping noise (inflated row counts, misleading summary-coverage metrics) and slow content_hash index growth — no logical bug, but worth fixing while the design is fresh.

## Implementation

New `QueryBuilder.pruneOrphanSummaries()` runs:

    DELETE FROM symbol_summaries WHERE node_id NOT IN (SELECT id FROM nodes)

inside a transaction, with before/after row counts captured atomically in the same transaction. Returns `{ summariesDeleted, embeddingsDeleted }`. Embedding rows cascade-delete via the existing FK CASCADE on `symbol_embeddings.node_id REFERENCES symbol_summaries(node_id)`.

Wired into `startBackgroundSummarization` to run after Phase 1 (summarize) completes, before Phase 2 (embed). Logged via logDebug only when something was actually deleted.

## Timing rationale

MUST run AFTER summarizeAll completes — by then any genuinely relocated summaries (function moved files, line shifted, extraction-version drift) have been re-attached to their new node_ids via the content-hash fallback inside summarizeAll. Calling DURING summarizeAll would race the reclaim and lose cache hits.

NOT called in no-chat configurations: orphans there might still be legitimately reclaimable by a FUTURE chat-enabled session via the content-hash fallback. Pruning prematurely would force the next chat session to regenerate from scratch. This is an acceptable cost (storage growth in configs that never gain a chat backend) for correctness in the more common case. Documented in the method JSDoc.

## Tests

`__tests__/migrations-022.test.ts` — 2 new tests:

- `pruneOrphanSummaries deletes only summaries whose node_id is gone` — sets up 3 nodes + 3 summaries + 2 embeddings, simulates clearStructural by toggling FKs OFF and DELETing 2 nodes (the exact post-clearStructural-then-reindex state), then prunes. Asserts: 2 summaries deleted, 1 embedding cascade-deleted, 1 live summary + 1 live embedding survive.
- `pruneOrphanSummaries is a no-op when nothing is orphaned` — asserts zero deletions and zero state change when all summaries have matching nodes (idempotent on a healthy DB).

Suite: 1095 / 13 skip / 0 fail (was 1093).

## Reviewer trail

Two passes.

Pass 1: REQUEST_CHANGES + 1 substantive finding (the four COUNT(*) reads were taken outside the DELETE transaction, leaving an inter-process race window where the embeddingsDeleted count could be wrong) + 2 info findings (directory_summaries scope, no-chat limitation). Substantive fix: wrapped all four counts AND the DELETE inside a single `this.db.transaction(() => { ... return obj; })()` so the snapshot is atomic. Info finding 3 addressed by adding a JSDoc paragraph documenting the no-chat-session reasoning.

Pass 2: APPROVE + 1 info finding (the WASM adapter doesn't support nested transactions — not actionable here since pruneOrphanSummaries is never called from another transaction).

## Out of scope (deferred to follow-up)

- `directory_summaries` rows for deleted directories accumulate similarly. Different keying (dir_path, not node_id) means a different prune query. Worth a future PR.
- WASM adapter savepoint support — WasmDatabaseAdapter.transaction() uses bare BEGIN/COMMIT, no savepoints. Future callers that nest pruneOrphanSummaries inside another transaction would hit a SQLite "cannot start a transaction within a transaction" error on WASM only.
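A minimal in-memory sketch of the prune semantics described above — delete summaries whose node_id no longer exists, let embeddings "cascade", and return before/after deltas in one atomic step. The sets stand in for the SQLite tables; the real implementation runs SQL inside a single transaction, so this is illustrative only.

```typescript
interface PruneResult { summariesDeleted: number; embeddingsDeleted: number }

// In-memory analogue of pruneOrphanSummaries: nodes are live ids,
// summaries/embeddings are keyed by node_id.
function pruneOrphanSummaries(
  nodes: Set<string>,
  summaries: Set<string>,
  embeddings: Set<string>,
): PruneResult {
  const beforeSummaries = summaries.size;
  const beforeEmbeddings = embeddings.size;
  for (const id of [...summaries]) {
    if (!nodes.has(id)) {
      summaries.delete(id);
      embeddings.delete(id); // analogue of the FK CASCADE on symbol_embeddings
    }
  }
  return {
    summariesDeleted: beforeSummaries - summaries.size,
    embeddingsDeleted: beforeEmbeddings - embeddings.size,
  };
}
```

This mirrors the first test's scenario: 3 nodes reduced to 1 leaves 2 orphan summaries to delete, with 1 embedding cascading.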
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the two follow-ups noted in 1e5fa19's commit message:

## directory_summaries orphan pruning

Same issue `pruneOrphanSummaries` fixed for symbol_summaries, but for directory_summaries — when a dir is deleted/renamed/excluded its summary lingers indefinitely (no FK anchor; dir_path is the PK).

`QueryBuilder.pruneOrphanDirectorySummaries()`:

- Reads all `files.path` rows.
- For each file, computes the IMMEDIATE PARENT directory via `path.posix.dirname(filePath.replace(/\\/g, '/'))` — exactly mirroring what `dir-summarizer.ts:groupByDir` writes. Crucially does NOT walk the ancestor chain (the first draft did, which would let stale intermediate-dir summaries survive — e.g. a `src` summary from when files lived directly in `src/` but have since all moved into `src/core/`).
- Inserts the live set into a `TEMP TABLE _live_dirs` and runs `DELETE FROM directory_summaries WHERE dir_path NOT IN (...)`.
- All four COUNT(*) reads + the DELETE share one transaction so the before/after snapshot is atomic (same race fix as `pruneOrphanSummaries`).
- Returns `{ directorySummariesDeleted }`.

Wired into `startBackgroundSummarization` to run after Phase 3 (dir summarizer). Same timing rationale as the symbol prune: must run AFTER the writer has had its chance to refresh stale entries, or we'd race its writes for newly-summarised dirs.

### Tests in `__tests__/migrations-022.test.ts` (3 new)

- Live dirs preserved, orphan dirs pruned (mixed scenario with 4 indexed files spanning src/a, src/b, and src directly).
- Empty files table prunes everything (degenerate case).
- **Stale ancestor-only dir IS pruned** — regression test for the reviewer-caught under-pruning bug. Indexes only `src/core/foo.ts`, plants summaries for both `src/core` (live) and `src` (stale ancestor-only). Asserts only `src/core` survives. Pre-fix this would have left `src` alive due to the ancestor walk.
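A sketch of the live-directory computation described above: take only the IMMEDIATE parent of each indexed file (no ancestor walk), normalising Windows separators first. The `path.posix.dirname` call mirrors the commit text; the function name and everything around it are illustrative.

```typescript
import * as path from 'node:path';

// Build the set of directories that still have at least one indexed file
// directly inside them. Deliberately does NOT add ancestor directories:
// a stale `src` summary must not survive when all files moved to `src/core`.
function liveDirs(filePaths: string[]): Set<string> {
  const dirs = new Set<string>();
  for (const fp of filePaths) {
    dirs.add(path.posix.dirname(fp.replace(/\\/g, '/')));
  }
  return dirs;
}
```

Directory summaries whose dir_path is not in this set are the prune candidates.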
## WASM adapter savepoint nesting

`node-sqlite3-wasm` doesn't transparently handle nested transactions the way better-sqlite3 does. Pre-fix, a caller wrapping a `transaction()` inside another `transaction()` would get "cannot start a transaction within a transaction" — only on the WASM backend, silently OK on native. The reviewer flagged this in 1e5fa19's pass 1 as out of scope.

`WasmDatabaseAdapter.transaction()` in `src/db/sqlite-adapter.ts`:

- Added a `private _txDepth = 0` field tracking nesting depth on the connection.
- Outer call (depth 0) → BEGIN/COMMIT/ROLLBACK.
- Inner call (depth > 0) → SAVEPOINT/RELEASE/ROLLBACK TO + RELEASE.
- Savepoint name `cg_sp_${depth}` is depth-tagged so concurrent depth values produce stable distinct names; safe to inline (depth is a non-negative integer, no SQL-injection risk).
- Rollback paths wrapped in try/catch — if the rollback itself fails, the original error is annotated with a "(NOTE: ROLLBACK also failed: ...)" suffix before re-throw. Operators reading the log get a signal when connection state may be uncertain.
- `finally { this._txDepth-- }` restores depth on both success and failure paths.

`WasmDatabaseAdapter` is now exported (it was an un-exported `class`) so tests can drive it directly on machines where better-sqlite3 is installed and `createDatabase` would otherwise pick native. A brief JSDoc on the export explains the test-only intent.

### Tests in `__tests__/wasm-savepoint.test.ts` (NEW, 8 tests)

- Single-level commit.
- Nested commit (outer + inner both succeed).
- Inner throw + outer recovers (writes survive even though the inner SAVEPOINT was rolled back) — this is THE case that errored pre-fix.
- Outer throw → full rollback (zero rows survive).
- Depth counter resets on success AND failure paths (the third-run-after-failure case proves the `finally` restores depth on both branches).
- Three-deep nesting with L3 failure rolled back inside L2.
- transaction() return value passes through.
- transaction() args forwarded to the wrapped function.

Suite: 1106 / 13 skip / 0 fail (was 1095, +11 new tests).

## Reviewer trail

Two passes.

Pass 1: REQUEST_CHANGES + 1 substantive (under-pruning ancestor-walk bug — would let a stale `src` summary survive when files all moved into `src/core`) + 1 info (silently swallowed outer-rollback errors). Substantive: rewrote the prune to use immediate-parent only + added a regression test that would have failed pre-fix. Info: annotate caught rollback errors onto the original exception message.

Pass 2: REQUEST_CHANGES + 1 finding (stale JSDoc still described the removed ancestor-walk behavior) + 1 info (frozen-Error edge case in error mutation — theoretical, not actionable). JSDoc fixed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
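The depth-tracking shape described above can be sketched as follows. A `run` callback stands in for the SQL layer so the nesting logic is visible; the class name and callback are illustrative, while the BEGIN/SAVEPOINT choreography, `cg_sp_${depth}` naming, rollback annotation, and `finally` depth restore follow the commit text.

```typescript
class TxWrapper {
  private _txDepth = 0;
  constructor(private run: (sql: string) => void) {}

  transaction<T>(fn: () => T): T {
    const depth = this._txDepth;
    const sp = `cg_sp_${depth}`; // depth-tagged name; depth is a non-negative int, safe to inline
    this._txDepth++;
    try {
      if (depth === 0) this.run('BEGIN');
      else this.run(`SAVEPOINT ${sp}`);
      const result = fn();
      if (depth === 0) this.run('COMMIT');
      else this.run(`RELEASE ${sp}`);
      return result;
    } catch (err) {
      try {
        if (depth === 0) this.run('ROLLBACK');
        else { this.run(`ROLLBACK TO ${sp}`); this.run(`RELEASE ${sp}`); }
      } catch (rbErr) {
        // Original error wins; rollback failure is annotated so operators see
        // when connection state may be uncertain.
        if (err instanceof Error) err.message += ` (NOTE: ROLLBACK also failed: ${rbErr})`;
      }
      throw err;
    } finally {
      this._txDepth--; // restored on both success and failure paths
    }
  }
}
```

An inner throw rolls back only the inner savepoint; the outer transaction can catch it and still commit, which is exactly the case that errored pre-fix.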
andreinknv (Owner, Author) added a commit that referenced this pull request on May 1, 2026:
…server-config flags

Tooling-gap backlog (codegraph/docs/codegraph-tooling-gaps.md) closed:

- #1 freshness severity bucket — `classifyFreshness` with fresh|recent|stale|very_stale
- #2 allowStale flag — opt-in bypass for the heavy-drift gate, registry-injected schema
- #3 module format in status — `module-format.ts` parses package.json + tsconfig (JSONC-safe)
- #4 codegraph_imports tool + import-classifier — file/directory/bare/unresolvable filters
- #5 dynamic imports — extractor catches `import('…')` + `require('…')`, incl. template_string
- #6 build-context refs — new `build_context_refs` table for `__dirname` / `import.meta.*`
- #7 files.is_test flag — column populated by glob; surfaced in status as `(N test)`
- colbymchenry#11 summarize-also-embeds (discovered while dogfooding) — `cg.summarizeAll()` chains `embedAllSummaries`; new `cg.embedAll()` for the embed-only path; CLI `codegraph embed`

CLI/MCP alignment (5/32 → 33+/35):

- 13 new CLI commands via the `runViaMCP` shim: callers, callees, impact, node, similar, biomarkers, imports, help-tools, explore, hotspots, dead-code, config-refs, sql-refs, module-summary, role, coverage-query, pending-summaries, save-summaries, review-context
- 7 new MCP tools: codegraph_imports, codegraph_embed, codegraph_summarize, codegraph_sync, codegraph_reindex, codegraph_coverage_ingest, codegraph_init, codegraph_uninit, codegraph_unlock, codegraph_affected

MCP server-level operator config (`codegraph serve --mcp`):

- --no-write-tools / --allow-stale-default / --disable-tool (sandboxing)
- --llm-endpoint / --llm-chat-model / --llm-ask-model / --llm-embedding-model / --llm-api-key (operator LLM config; per-project config wins on conflict)
- New CODEGRAPH_LLM_* env vars wired through `mergeLlmEnv` in resolveLlmProviders

Architectural cleanups:

- `bypassFreshnessGate` and `isWriteTool` declarative flags on ToolModule (replaces a growing string-comparison chain in execute())
- `withAllowStale` registry injection only on tools that DO see the gate
- DRY of inline copy-paste in 3 hooks → `src/index-hooks/enclosing.ts`
- `LlmClient.isEmbeddingReachable` for split-provider correctness
- SyncResult `lockContention` flag → handleSync emits a distinct retryable message
- `clearStructural` deletes from build_context_refs (was orphan-leaking on --force)
- cli:dev npm script + tsx CLI fixed (web-tree-sitter `import type` for type-only refs)

Migrations:

- 023-files-is-test.ts — add `files.is_test`
- 024-build-context-refs.ts — add `build_context_refs` table

Reviewer rounds: 11 total, all REQUEST_CHANGES addressed inline. Notable fixes:

- JSONC URL strip via state machine (was eating `https://` tails)
- classifyFreshness very_stale now requires isStale (in-sync-but-old → recent)
- Dynamic imports also match template_string nodes
- process.exit deferred until after finally cleanup in runViaMCP
- --same-language / --different-language mutual-exclusion guard
- help-tools CLI bypasses isInitialized (works without a project)
- handleUninit sweeps projectCache by getProjectRoot (no dangling alias leaks)
- handleAffected errors instead of silently dropping unsupported glob filters
- mergeLlmEnv preserves precedence: legacy flat config wins over env-synthesised block

Suite: 1268 passing, 1 expected red (colbymchenry#8 — undecided), 13 skipped, 1 todo, 0 regressions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request on May 3, 2026:
After the structural diff lands, the natural next question is "did
my edits introduce or clean up any biomarker findings?". Pre-PR the
agent had to read the persisted index AND remember the baseline
state — neither of which travels well across an edit-loop. This
extension answers it from a git ref + in-memory analysis only, no
fresh sync required.
How it works
------------
For each file already extracted by compareToRef, run the existing
per-file biomarker engine (parseSource + findNodeInTree +
computeMetrics + evaluateRules) on BOTH sides. Diff findings by
(qualifiedName, kind, biomarker):
- only-in-after → added (new regression)
- only-in-before → cleared (refactored away)
- both → carried (status quo, surfaced as a count only
so the report doesn't double in size)
Cross-file biomarkers (unused_export, god_class, feature_envy) are
out of scope: they need the whole graph at the baseline ref, which
would require persisting and replaying graph state. Per-file rules
already cover 8 of 11 biomarker types, which is the dominant signal
for an edit-loop self-report.
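The bucketing above can be sketched directly. The key `(qualifiedName, kind, biomarker)` and the added/cleared/carried semantics come from the text; the `Finding` shape and function body are illustrative, not the repo's actual `computeFindingsDelta`.

```typescript
interface Finding { qualifiedName: string; kind: string; biomarker: string }

// Diff findings between the baseline ref ("before") and the working tree
// ("after"): only-in-after → added, only-in-before → cleared, both → carried
// (surfaced as a count only, so the report doesn't double in size).
function computeFindingsDelta(before: Finding[], after: Finding[]) {
  const key = (f: Finding) => `${f.qualifiedName}\u0000${f.kind}\u0000${f.biomarker}`;
  const beforeKeys = new Set(before.map(key));
  const afterKeys = new Set(after.map(key));
  return {
    added: after.filter(f => !beforeKeys.has(key(f))),
    cleared: before.filter(f => !afterKeys.has(key(f))),
    carried: after.filter(f => beforeKeys.has(key(f))).length,
  };
}
```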
Surfaces
--------
- src/compare/index.ts — new FindingDelta / FindingsDelta exports,
evaluateFileFindings + computeFindingsDelta helpers, findingsDelta
flag on CompareOptions, optional findingsDelta field on FileDelta.
- src/mcp/tools/compare.ts — new findingsDelta arg + render sections
("+ findings introduced", "− findings cleared", "carried over: N").
- src/bin/codegraph.ts — --findings-delta flag on compare-to-ref CLI.
- src/biomarkers/index.ts — exports ANALYSABLE_KINDS and
ANALYSABLE_MIN_LOC (the in-memory analysis needs the same gate as
the indexed pass).
Tests (4 new in __tests__/compare.test.ts)
- regression case: gnarly edit on alpha → added findings on alpha
- cleanup case: gnarly baseline + clean current → cleared findings
- carry case: gnarly baseline + slightly-tweaked-but-still-gnarly
current → carried > 0, added/cleared = 0
- no-change case: empty findings delta when nothing differs from REF
Suite: 1370/13/0 (was 1366; +4 new). tsc clean.
Reviewer pass
-------------
APPROVE with two info-level fixes addressed before commit:
- carried bucket was populated but never rendered → added a
collapsed "N pre-existing findings carried over" line.
- carried path was untested → added the tweaked-but-gnarly case.
Dogfooded on this very PR; the report correctly surfaced findings
my own changes introduced, including a long_parameter_list warning
on computeFindingsDelta that I bundled into a FileSnapshot interface
before commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request on May 3, 2026:
…olbymchenry#8)

Promotes the resolver's numeric `metadata.confidence` to a categorical, queryable column on `edges`. `codegraph_callers`, `_callees`, and `_impact` now annotate each row with `*(INFERRED)*` when the edge is not concrete, and accept a new `minConfidence` arg to drop edges below a chosen level — useful for refactor-impact analysis where false-positive reach inflates the blast radius.

Migration 032 adds `edges.confidence TEXT` (nullable; the read-cast in EDGE_SCHEMA collapses NULL → `EXTRACTED`). Nullable means legacy / structural / extractor-direct edges need no producer change. Index on the column for filter-down queries. Idempotent ADD COLUMN with a table-exists guard following the convention from migrations 020 / 021 / 023 / 030.

Producer: `ReferenceResolver.createEdges` calls a new free `classifyConfidence(resolvedBy, score)` helper:

- import / qualified-name / file-path / exact-match → EXTRACTED
- framework with score >= 0.85 → EXTRACTED, else INFERRED
- instance-method / fuzzy → INFERRED

AMBIGUOUS is not stamped from the resolver path in v1 — it needs tie-tracking from the candidate picker. Tracked as B3.

Consumers (`result-formatters.ts`):

- `formatConfidence(edge)` — markdown italic suffix (empty for EXTRACTED; `*(INFERRED)*` / `*(AMBIGUOUS)*` otherwise).
- `parseMinConfidence(raw)` — returns level / null / errorResult.
- `filterByConfidence(rows, min)` — drops rows below the threshold.
- `CONFIDENCE_RANK` — numeric ordering for inline filter loops.

Wired into all four code paths in `callers.ts` (collectCallers, collectCallersForSource, collectTypeUsers, formatGroupedCallers), both paths in `callees.ts`, and `impact.ts` with a `filterImpactByConfidence` BFS that prunes nodes unreachable along the surviving edge set.

Type cleanup: `Edge.confidence` added with JSDoc; the long-dead `Edge.provenance` field also got a JSDoc note pointing at the SCIP-integration-reservation explanation in memory.

Reviewer-memo #4 catch (schema-version test forgetfulness): noticed `__tests__/pr19-improvements.test.ts` was still asserting `toBe(29)` even though migrations 030 and 031 had shipped. The catch-up to 32 in this commit covers both the missed bumps and the new migration.

Reviewer: APPROVE, three info-only findings (type-user display gap, framework-rule calibration table doc, catch of the recurring pr19 pattern).

Suite: 1555 / 34 / 0 (+18 new edge-confidence tests). Eval: 11/11 passed | recall=1.00 | mrr=0.86 (within regression budget vs the 091e935 baseline). Phase 3 done.

Backlog now down to: #6 TOON (deferred), colbymchenry#11 trace_to_culprits, colbymchenry#17/colbymchenry#18 Phase 7, colbymchenry#19 streaming.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
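The classification rules above can be sketched as a small pure function. The `resolvedBy` strings and the 0.85 framework threshold come from the commit text; treat the exact rule set as an assumption about the real helper, which may carry more cases.

```typescript
type Confidence = 'EXTRACTED' | 'INFERRED';

// Map a resolver strategy + score to a categorical confidence level,
// per the rules listed in the commit message.
function classifyConfidence(resolvedBy: string, score: number): Confidence {
  switch (resolvedBy) {
    case 'import':
    case 'qualified-name':
    case 'file-path':
    case 'exact-match':
      return 'EXTRACTED';
    case 'framework':
      return score >= 0.85 ? 'EXTRACTED' : 'INFERRED';
    default: // instance-method, fuzzy, and anything unrecognised
      return 'INFERRED';
  }
}
```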
andreinknv added a commit that referenced this pull request on May 4, 2026:
…sites (colbymchenry#11)

Last Phase 1-6 backlog item. Replaces the "agent stitches codegraph_callers + _history + _biomarkers manually for every stack frame" loop with one tool: paste a stack trace, get a ranked list of likely fix-site candidates with per-row "why suspected" reasons.

## Pipeline

1. `parseTrace(text)` — multi-format extractor: V8/Node, Python, Java/Kotlin/Scala, Go, Rust, plus a generic `path/to/file.ext:line` fallback. Caps at 200 frames; dedupes by `(file, line)`; normalises Windows backslashes; first-pattern-per-line wins.
2. `resolveTraceFile` — longest-suffix match between absolute trace paths and project-relative indexed paths, with a path-boundary check so `foo/bar.ts` doesn't match `sub-foo/bar.ts`.
3. `findEnclosingNode` (existing helper from index-hooks/enclosing.ts, shared with codegraph_grep) attributes each frame to its enclosing function/method/class via line number.
4. `scoreCulprits` composite: `1 / (topRank + 1)` for frame position (top of stack wins decisively), `+0.3` for risk-biomarker overlap (god_class / complex_method / large_method / nested_complexity), `+0.2` for recent file churn (touched within 30 days, with the commit count surfaced in the reason).
5. Output: ranked candidates with per-row reasons + an "unmapped frames" footer (capped at 5 samples) so the agent sees what couldn't be attributed (`node_modules/`, generated files, etc.).

## CLI mirror

`codegraph trace-to-culprits` reads from stdin (`cat error.log | codegraph trace-to-culprits`) or from `--trace "..."`. Routes via runViaMCP per the repo convention.

## Reviewer cycle

Round 1 returned BLOCK + REQUEST_CHANGES + info — all addressed:

- **R1 (BLOCK, correctness)**: `recentCutoff` was computed in ms but `last_touched_ts` is unix seconds — the `>=` comparison always evaluated false, silently killing the churn signal on every project. Fixed with `Math.floor(Date.now() / 1000)` and a new R1-regression-guard test that stamps a fresh `last_touched_ts` via the same `applyChurnDeltas` helper the churn miner uses, then asserts the rendered output mentions "recent churn".
- **R2 (REQUEST_CHANGES, correctness)**: an inline `cg.queries.db.prepare(...)` violated the per-domain query-file convention and re-prepared the same statement per culprit. Switched to `getFileByPath` from queries-files.ts (cached statement, returns a FileRecord with `lastTouchedTs` and `commitCount`).
- **R3 (info, perf)**: `resolveTraceFile` ran `getAllFiles` once per frame — N full-table scans for an N-frame trace. Hoisted to once-per-call at the top of `buildCulprits`.

## Tests

13 trace tests in `__tests__/mcp-trace-to-culprits.test.ts`:

- parseTrace: V8 / Python / Java / Go / dedup / 200-frame cap / no-frames / Windows backslash normalisation
- End-to-end: V8 trace → enclosing symbols ranked top-of-stack first, unmapped-frames footer, no-refs error message, limit cap, R1-regression guard for the churn signal

CLI dogfood: `cat trace | npm run cli:dev -- trace-to-culprits` parses a real production-shaped V8 trace and maps frames to `src/mcp/tools/trace-to-culprits.ts` symbols.

Suite: **1595 / 34 / 0** (+13 trace tests). Eval: 18/18 | recall=1.00 | mrr=0.91 | within budget (re-baselined after colbymchenry#11 added new files to the corpus, which shifted the explore-pipeline case rank — the case still PASSes its 0.5 threshold; the relative regression was corpus drift, not a behaviour change).

## Backlog after this

Phase 1-6 complete. Remaining: Phase 7 (colbymchenry#17 propose_extract / propose_rename, colbymchenry#18 plan-and-execute CLI — both need user check), Phase 8 (colbymchenry#19 streaming MCP), and #6 TOON (deferred pending measurement).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
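The R1 unit bug and its fix reduce to one line: `last_touched_ts` is stored in unix seconds, so the recency cutoff must also be in seconds, never milliseconds. A minimal sketch, assuming the 30-day window from the scoring description; the function name is illustrative.

```typescript
const RECENT_WINDOW_SECONDS = 30 * 24 * 60 * 60;

// true when the file was touched within the recency window.
// lastTouchedTs is unix SECONDS; the pre-fix bug compared it against a
// millisecond cutoff, which made the predicate false for every real file.
function isRecentChurn(lastTouchedTs: number, nowMs: number = Date.now()): boolean {
  const cutoff = Math.floor(nowMs / 1000) - RECENT_WINDOW_SECONDS;
  return lastTouchedTs >= cutoff;
}
```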
andreinknv added a commit that referenced this pull request on May 4, 2026:
Backlog #6 proposed adopting TOON (Token-Oriented Object Notation, header-once / rows-as-tuples) as a smaller alternative to the current markdown formatters for tabular MCP responses. The backlog explicitly gated the decision: "Verify actual savings on captured queries before flipping default — the 30-60% claim is for ideal tabular data." This commit ships the measurement and the answer.

## Result

Eight representative queries against this project's own .codegraph, covering search (long signatures), suggest (short rows), and callers (20-row mixed-confidence list):

    sample                         rows   md(B)  toon(B)  saving
    ─────────────────────────────  ────  ──────  ───────  ──────
    search "CodeGraph"               10     683      661   +3.2%
    search "extractFromSource"       10    1483     1444   +2.6%
    search "compareToRef"             3     461      468   -1.5%
    search "handleSearch"             6     760      757   +0.4%
    search "parseTrace"               2     291      298   -2.4%
    suggest "CodGrap"                10     408      409   -0.2%
    suggest "extracFromSorce"        10     386      387   -0.3%
    callers of extractFromSource     14     936     1037  -10.8%

    TOTAL: md=5408B  toon=5461B  aggregate saving -1.0%

## Why the 30-60% claim doesn't apply

TOON's win comes from compressing a verbose JSON baseline like `[{"name":"foo","kind":"function","file":"a.ts","line":42},…]` — the "headers repeated per row" + JSON quoting waste is what its header-once shape removes. But our markdown is ALREADY row-shaped:

- `### foo (function)` ≈ `foo,function` (no quoting).
- `a.ts:42` ≈ `a.ts,42` (single-char delimiter, both compact).
- `- name (kind) - file:line` is shorter than the comma-tuple form when names are short.

There's no fat to trim. On callers (the densest row shape) TOON is 10% LARGER because the per-row `- ` bullet syntax is a one-byte overhead while TOON's commas waste two chars between every field.

## Decision

Skip TOON. The empirical answer matches the backlog's gate — the savings aren't there for our markdown baseline. Adopting it would:

- Add per-tool format-selector code + tests.
- Risk LLM-client misrender (TOON is new; some clients haven't trained on it).
- Net zero to net negative on payload bytes.

The measurement script (`__tests__/evaluation/toon-measure.ts`) stays in the repo as a permanent record + a re-runnable harness if the markdown formatters ever get verbose enough to flip the math. Run with `npx tsx __tests__/evaluation/toon-measure.ts`.

## Backlog

#6 closed. Phases 1-6 of the agentic backlog are now fully resolved (every item shipped or empirically dismissed). Remaining: Phase 7 (colbymchenry#17 propose_extract / propose_rename, colbymchenry#18 plan-and-execute CLI), Phase 8 (colbymchenry#19 streaming MCP) — all need user check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request on May 9, 2026:
…search

Subtask #6 of the embedding-features arc, Stage 3.

New module src/llm/query-expansion.ts asks the configured chat backend for paraphrases of a search query and caches the result by sha256(n + ':' + text).

Wired into searchHybridBlendWithEmbeddings: when the top semantic-hit cosine score < 0.5 AND a chat backend is reachable, the function expands the query into 3 paraphrases, embeds each, runs them through llmSemanticTopK, and max-pools the candidate scores across the original + expansions.

Anti-goals enforced:

- Returns [] on LLM error / abort / unreachable backend so callers fall back to the unexpanded path without paying a query stall.
- LRU-ish cache bounded at 256 entries — bumps recency on read, evicts the oldest entry on write at capacity.
- Filters out paraphrases identical to the original (de-dup) and lines >= 500 chars (LLM hallucination guard).

Threshold + paraphrase count live as module-private constants (QUERY_EXPANSION_SCORE_THRESHOLD = 0.5, _PARAPHRASE_COUNT = 3).

Suite: 2136 -> 2163 (+27 tests covering input validation, LLM call shape, output parsing, error handling, caching, duplicate filter). Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
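The "LRU-ish cache bounded at 256 entries" behaviour described above — bump recency on read, evict the oldest on write at capacity — maps naturally onto an insertion-ordered Map. A sketch under that assumption; the class and its API are illustrative, not the module's real cache.

```typescript
class LruIshCache<V> {
  private map = new Map<string, V>();
  constructor(private capacity = 256) {}

  get(key: string): V | undefined {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key)!;
    // Re-insert to move the key to the back of the Map's insertion order,
    // i.e. bump recency on read.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    if (this.map.has(key)) this.map.delete(key);
    else if (this.map.size >= this.capacity) {
      // Evict the oldest entry: the first key in insertion order.
      const oldest = this.map.keys().next().value as string;
      this.map.delete(oldest);
    }
    this.map.set(key, value);
  }
}
```

"LRU-ish" is apt: a read promotes via delete+set rather than a linked-list splice, which is O(1) amortised on V8 Maps and keeps the code tiny.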
andreinknv added a commit that referenced this pull request on May 10, 2026:
Adds `compact: true` to the codegraph_callers + codegraph_callees inputSchema. When set, the formatNodeList renderer emits one terse line per node:

    Callers of foo (3)
    bar|function|src/x.ts:42|id:n_a1b2c3d4
    baz|method|src/y.ts:120|id:n_e5f6g7h8

instead of the default markdown:

    ## Callers of foo (3 found)
    - bar (function) - src/x.ts:42 *(EXTRACTED)* (2 call sites: 12, 45) `[id: n_a1b2c3d4]`

Saves ~50-70% of output tokens on graph-walk chains. Default false preserves the rich human-readable output for ad-hoc inspection.

The pattern is straightforward to extend to the other navigation tools (search, refs, impact). Tracked as Stage 6 #6.1 follow-up.

Suite remains 2374/0/34. Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
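The compact row shape shown above is a fixed pipe-joined template. A minimal sketch of that rendering; the `NodeRow` interface and function name are illustrative, only the `name|kind|path:line|id:uid` layout comes from the example output.

```typescript
interface NodeRow { name: string; kind: string; path: string; line: number; id: string }

// One terse line per node, matching the compact-mode example:
// bar|function|src/x.ts:42|id:n_a1b2c3d4
function formatCompact(rows: NodeRow[]): string[] {
  return rows.map(r => `${r.name}|${r.kind}|${r.path}:${r.line}|id:${r.id}`);
}
```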
andreinknv added a commit that referenced this pull request on May 10, 2026:
New MCP tool consolidating 5-10 chained codegraph_callers/callees calls into one BFS traversal.

Inputs:

- start: symbol name OR n_xxxxxxxx UID
- direction: 'callers' | 'callees' | 'impact' (bidirectional)
- hops: 1-5, default 2
- maxNodes: 1-200, default 50
- edgeKind?: filter to one edge kind
- compact?: default TRUE (this tool's purpose IS terse output)
- projectPath?: standard

The BFS deduplicates via a node_id Map, first-seen wins (shorter depth preferred). Excluded edge kinds on unfiltered walks: similar_to / def_use / contains (structural noise).

Output rows: `name|kind|path:line|depth=N|via=parent_name|id:n_xxxxxxxx`.

26 tests in __tests__/mcp-walk.test.ts covering input validation, BFS shape (hops/cycles/dedup/maxNodes cap), output format (compact vs markdown), and registration. Tool registered. Suite passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
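The BFS invariants described above — dedup via a Map keyed by node id, first-seen wins so the shortest depth is kept, traversal bounded by hop and node caps — can be sketched over a plain adjacency map. Everything here is illustrative; the real tool walks SQLite edge rows.

```typescript
interface Visit { id: string; depth: number; via: string | null }

// Bounded BFS: returns every visited node with its depth and the parent
// ("via") it was first reached through. Cycles are handled by the seen Map.
function walk(
  edges: Map<string, string[]>, // adjacency: node id -> neighbour ids
  start: string,
  hops: number,
  maxNodes: number,
): Visit[] {
  const seen = new Map<string, Visit>();
  let frontier: Visit[] = [{ id: start, depth: 0, via: null }];
  seen.set(start, frontier[0]);
  while (frontier.length > 0 && seen.size < maxNodes) {
    const next: Visit[] = [];
    for (const v of frontier) {
      if (v.depth >= hops) continue; // hop cap
      for (const nb of edges.get(v.id) ?? []) {
        if (seen.has(nb) || seen.size >= maxNodes) continue; // first-seen wins
        const visit: Visit = { id: nb, depth: v.depth + 1, via: v.id };
        seen.set(nb, visit);
        next.push(visit);
      }
    }
    frontier = next;
  }
  return [...seen.values()];
}
```

Because BFS expands level by level, "first-seen wins" automatically means "shortest depth preferred" without any explicit comparison.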
andreinknv added a commit that referenced this pull request on May 10, 2026:
…#6.2 + #6.3)

Closes the inline Stage 6 deferrals.

#6.2 since-tokens — codegraph_callers and codegraph_callees now support `since: 'c_xxxxxxxx'` delta-mode chains. Same pattern as codegraph_search / codegraph_grep / codegraph_explore (the applyDeltaSince + mintCallId + appendCallIdFooter helpers from shared.ts). Saves ~40-60% of output on multi-step walks where the agent already saw most of the prior result.

#6.3 field projection — both tools accept a `fields: string[]` arg that restricts compact-mode rows to a subset of {name, kind, path, line, id}. Combined with `compact: true`, this drops output to the absolute minimum the caller needs:

    fields: ["name", "path"] → `findSimilarViaVec|src/db/vec-helpers.ts`

Saves another ~50% on top of compact. Both flags are opt-in; default behavior is unchanged.

Suite remains 2433/0/34. Typecheck clean.

Stage 6 token-cost reductions are now complete: #6.1 compact mode + #6.2 since-tokens + #6.3 field projection + #6.4 codegraph_walk all shipped. Cumulative effect on graph-heavy sessions: 60-80% output-token reduction when all flags are set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
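Field projection as described above is a filter over a fixed canonical field order, so the same subset always renders in the same column sequence. A sketch under that assumption; the field names {name, kind, path, line, id} come from the text, the row shape and function are illustrative.

```typescript
type Field = 'name' | 'kind' | 'path' | 'line' | 'id';
const FIELD_ORDER: Field[] = ['name', 'kind', 'path', 'line', 'id'];

// Render one compact row restricted to the requested fields, preserving
// the canonical order regardless of the order the caller listed them in.
function projectRow(row: Record<Field, string>, fields: Field[]): string {
  const wanted = new Set(fields);
  return FIELD_ORDER.filter(f => wanted.has(f)).map(f => row[f]).join('|');
}
```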
andreinknv added a commit that referenced this pull request on May 10, 2026:
Reviewer pass on c895348..5730d98 caught 3 request_changes + 3 info, all addressed. Plus the deferred staleness gating (info #6) and the playbook updates the user requested as a follow-up.

## Request_changes (real bugs)

1. **Chunk vec0 not cleared on full reset** (queries.ts + vec-helpers.ts): `clearAll` deletes nodes with FKs OFF, so the ON DELETE CASCADE on `symbol_chunk_embeddings` never fires. `clearVecTables` then only wiped `vec_symbol_embeddings_*` — leaving the chunk vec0 mirror tables full of ghost rowids that would poison the next index run's `findSimilarViaVec` chunk queries. Fixed: explicit `DELETE FROM symbol_chunk_embeddings` inside `clearAll`'s transaction; `clearVecTables` now also discovers and wipes `vec_chunk_embeddings_*` via the same LIKE pattern + regex whitelist.
2. **Docstring rot on upsertChunkEmbedding** (queries-chunk-embeddings.ts): the JSDoc claimed "false when an existing row was updated", but `ON CONFLICT DO UPDATE` produces `changes=1` on both INSERT and UPDATE — the function returned true in both cases. Tightened the JSDoc to match actual behavior; simplified the return condition to `result.changes > 0` (drops the meaningless lastInsertRowid check).
3. **codegraph_walk missing CLI mirror** (bin/codegraph.ts): the new MCP tool had no `program.command('walk')` counterpart, violating the repo's CLI/MCP alignment convention. Added a thin CLI command that delegates via runViaMCP — same shape as the other delegating subcommands.

## Info-level

4. **Deferred staleness gating shipped** (multi-vec.ts): previously re-embedded every long symbol's chunks on every sync. Now stamps the file's content_hash on each chunk row (in the previously-unused summary_hash_at_embed column, repurposed for multi-vec) and discoverLongSymbols' SELECT skips symbols whose stored hash matches the current files.content_hash. Idempotent re-syncs with no source drift now do zero embed work.
5. **parseFieldsArg duplication** (shared.ts + callers/callees): extracted to `shared.ts` as `parseFieldsArg` + the `CompactFieldName` type. Callers and callees both import the shared version.
6. **Playbook updates** (server-instructions.ts):
   - Added a `codegraph_walk` use-case bullet under graph-walk tools.
   - New "Token-cost flags" section documenting compact / fields / since on callers + callees.
   - Updated the `codegraph_risk_review` reference to the merged `codegraph_review({mode: 'risk'})` form.

Skipped (genuine non-issues): the `since` delta filter on callers not covering typeUsers (typeUsers is a small distinct section by design, not part of the same-rank dedup); CRLF handling in sliceNodeBody (would require explicit \\r\\n splitting; deferred until Windows operators report regressions).

Suite remains 2433/0/34. Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request on May 11, 2026:
…olved-by-FTS

Three more friction closures from the 2026-05-11 workout:

- src/extraction/index.ts + src/db/queries-files.ts: heal-marked files now propagate through the git fast-path. The `EXTRACTION_LOGIC_VERSION` self-heal sets `needs_reextract=1` on every file, but the git fast-path `eoCollectGitChanges` only iterated files git reported as changed — so heal-marked files with no diff were silently skipped in any git-tracked project. Adds a `getFilesNeedingReextract(qb)` query and unions the result into the change set after the git fast-path returns. The full-scan path was already correct via `eoClassifyFileChange`; no change there. Closes friction colbymchenry#22 (surfaced while validating the cluster fix in 6957b74 — the original `needs_reextract=1` → 620 of 627 stayed pending after sync, requiring `admin index --force`).
- src/mcp/tools/entry-points.ts: new "MCP tools" bucket scans for exported `*_TOOL` constants whose initializer signature contains `name: 'codegraph_*'` and surfaces them with the parsed canonical tool name. Closes friction colbymchenry#13 / cluster symptom #6: every MCP tool dispatcher previously fell through every bucket because routes/CLI commands/CLI files/public exports each had a different filter mismatch. Five buckets now (HTTP routes, CLI commands, MCP tools, CLI files, public exports). `__tests__/mcp-entry-points` grows 5 → 6 tests with the new bucket fixture.
- Friction colbymchenry#9 (semantic/intent search can't bridge tool-name → handler-name): resolved by the existing FTS5 path. Investigation showed `nodes_fts` indexes the `signature` column at bm25=2, so searching `codegraph_X` already returns the matching XXX_TOOL constant via signature substring match. The morning-workout "No results" observation was likely a stale-index transient (pre-cluster-fix). Added a regression test in mcp-search-family to lock in the behavior.

Test impact: 2633 → 2635 (one new MCP-tools-bucket test, one new canonical-tool-name regression test). Type-check clean.

Closes:

- colbymchenry#22 EXTRACTION_LOGIC_VERSION heal flag honored
- colbymchenry#13 entry_points 5th bucket for MCP tool dispatchers
- colbymchenry#9 already-resolved-by-FTS — locked in with regression test

Still open:

- colbymchenry#8 / cluster #1 (context whiff): the graph edge now exists, but the context-builder ranker still surfaces ToolHandler/ToolResult/ToolCtx instead of dispatchers. Needs a context-ranker boost for `codegraph_*` queries. Logged as colbymchenry#24 follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
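The fast-path union described above can be sketched as follows — `collectChangeSet` and the `FileRow` shape are illustrative assumptions for this sketch, not the real codegraph API (the actual `getFilesNeedingReextract(qb)` reads the files table through the QueryBuilder):

```typescript
// Hypothetical sketch: after the git fast-path returns its change set, union in
// any files the EXTRACTION_LOGIC_VERSION self-heal marked with needs_reextract=1,
// since git reports no diff for those and they would otherwise be skipped.
interface FileRow {
  path: string;
  needsReextract: boolean;
}

// Stand-in for getFilesNeedingReextract(qb); the real query hits the DB.
function getFilesNeedingReextract(files: FileRow[]): string[] {
  return files.filter((f) => f.needsReextract).map((f) => f.path);
}

function collectChangeSet(gitChanged: string[], allFiles: FileRow[]): string[] {
  const changed = new Set(gitChanged);
  // Heal-marked files with no git diff get folded into the same change set.
  for (const p of getFilesNeedingReextract(allFiles)) changed.add(p);
  return [...changed].sort();
}
```

A `Set` makes the union idempotent: a file that is both git-changed and heal-marked is re-extracted once.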
Stress-test fixes — PR 4 of 5 (stacked). Base: `pr3-split-provider-and-summary-batching`.

## Commits

- `1e5fa19` fix: GC orphan summary rows after each summarize pass
- `9adac63` fix: prune orphan directory_summaries; WASM adapter savepoint nesting

## Summary
### Symbol-summary GC (`1e5fa19`)

Surfaced after running `--force` with two different chat models on the same repo: `symbol_summaries` held 723 rows — 675 fresh plus 48 stale rows from the prior model whose node_ids no longer exist. The cache-preservation work in PR 2 left them behind, and the by-content-hash fallback didn't reclaim them onto current node_ids. `pruneOrphanSummaries()` runs `DELETE FROM symbol_summaries WHERE node_id NOT IN (SELECT id FROM nodes)` inside a transaction; embedding rows cascade-delete via the existing FK. Wired AFTER `summarizeAll` (so the content-hash reclaim has had its chance) and BEFORE embedding (so embeddings only run on live rows).
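A self-contained in-memory model of the prune pass. The real method issues the `DELETE` through the QueryBuilder inside a SQLite transaction and relies on FK CASCADE for the embedding rows; the `Store` shape here is an assumption made purely for illustration:

```typescript
// In-memory stand-ins for the three tables involved; keys are node_ids.
interface Store {
  nodes: Set<string>;      // nodes.id
  summaries: Set<string>;  // symbol_summaries.node_id
  embeddings: Set<string>; // symbol_embeddings.node_id (FK CASCADE onto summaries)
}

function pruneOrphanSummaries(db: Store): { summariesDeleted: number; embeddingsDeleted: number } {
  let summariesDeleted = 0;
  let embeddingsDeleted = 0;
  // Mirrors: DELETE FROM symbol_summaries WHERE node_id NOT IN (SELECT id FROM nodes)
  for (const nodeId of [...db.summaries]) {
    if (!db.nodes.has(nodeId)) {
      db.summaries.delete(nodeId);
      summariesDeleted++;
      // Mimics the FK CASCADE: deleting a summary takes its embedding with it.
      if (db.embeddings.delete(nodeId)) embeddingsDeleted++;
    }
  }
  return { summariesDeleted, embeddingsDeleted };
}
```

Capturing both counters inside the same pass matches the before/after bookkeeping the real transaction records atomically.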
### Directory-summary GC (`9adac63`)

Same issue for `directory_summaries`. The live-dir set is computed via `path.posix.dirname(filePath)` — exactly mirroring what `dir-summarizer.ts` writes. Crucially it does NOT walk the ancestor chain (the first draft did, which let stale intermediate-dir summaries survive — e.g. a `src` summary from when files lived directly in `src/` but have since all moved into `src/core/`). Regression-tested.
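A sketch of the live-dir computation, assuming the same posix-normalized relative paths `dir-summarizer.ts` writes; `pruneOrphanDirSummaries` is a hypothetical name for illustration:

```typescript
import * as path from "node:path";

// Live-dir set: exactly the directories dir-summarizer would write summaries for.
// Deliberately the immediate dirname only, NOT the ancestor chain — a stale `src`
// summary must become prunable once every file has moved into `src/core/`.
function liveDirs(filePaths: string[]): Set<string> {
  const dirs = new Set<string>();
  for (const p of filePaths) dirs.add(path.posix.dirname(p));
  return dirs;
}

// Returns the directories whose summaries are orphaned and should be deleted.
function pruneOrphanDirSummaries(summaryDirs: string[], filePaths: string[]): string[] {
  const live = liveDirs(filePaths);
  return summaryDirs.filter((d) => !live.has(d));
}
```

With files only under `src/core/`, a leftover `src` summary falls outside the live set and gets pruned — the ancestor-walk variant would have kept it alive forever.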
### WASM savepoint nesting (`9adac63`)

`WasmDatabaseAdapter` was using bare `BEGIN`/`COMMIT` for nested transactions, which sqlite-wasm rejects. Switched to savepoint-based nesting. Also exported the adapter from the package index so callers can construct it directly (the only way to exercise WASM on a Mac, where `createDatabase` always picks `better-sqlite3`).
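The savepoint scheme can be sketched as follows. The depth counter and the exact SQL emitted per nesting level are assumptions about one reasonable implementation, not the actual `WasmDatabaseAdapter` code:

```typescript
// Emits BEGIN/COMMIT only at the outermost level and SAVEPOINT/RELEASE for inner
// levels, so sqlite never sees a nested BEGIN (which sqlite-wasm rejects).
class TxnNesting {
  private depth = 0;

  begin(): string {
    const sql = this.depth === 0 ? "BEGIN" : `SAVEPOINT sp_${this.depth}`;
    this.depth++;
    return sql;
  }

  commit(): string {
    this.depth--;
    return this.depth === 0 ? "COMMIT" : `RELEASE SAVEPOINT sp_${this.depth}`;
  }

  rollback(): string {
    this.depth--;
    // ROLLBACK TO rewinds but leaves the savepoint open, so release it as well.
    return this.depth === 0
      ? "ROLLBACK"
      : `ROLLBACK TO sp_${this.depth}; RELEASE SAVEPOINT sp_${this.depth}`;
  }
}
```

SAVEPOINT names only need to be unique per nesting level, so reusing `sp_<depth>` after a RELEASE is safe in SQLite.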
## Test plan

`__tests__/migrations-022.test.ts` covers live-preserved + orphan-pruned + ancestor-only-pruned scenarios.

WIP draft.