Maintainer guide: how to merge the open PR backlog with minimal friction
Update — April 28, 2026: 14 of my PRs have been pre-rebased onto the post-refactor patterns. The agent-callable code-intelligence layer is fleshing out: #124 (per-symbol coverage from lcov reports) and #125 (static-analysis biomarkers + Code Health score) add two new MCP tools so an AI agent can ask "is this function tested?" and "is this function risky to change?" with a single structured query, instead of reading the source. Both stack on #118 (file-based migrations); #125 also runs as an IndexHook. #122 remains a tiny self-contained perf win in monolithic shape that targets main directly. See the rebase summary comment for the full list. The rebase templates below remain authoritative documentation for external PRs (#58, #91, #89) and for any future PR that needs to land on top of the refactors.
There are ~24 open PRs touching colbymchenry/codegraph. Many of them collide on the same handful of monolithic data structures (lists, switches, version numbers, hardcoded method additions). I've opened 4 refactor PRs that eliminate those collision surfaces. This document tells you the order to merge things in to minimize manual conflict-resolution work, and shows you exactly what the rebase of every existing open PR looks like.
Easiest path for the maintainer: 12 clean merges in 4 rounds
I walked this path end-to-end against a fresh colbymchenry:main checkout three times to confirm — twice with intermediate rebases that broke things in subtle ways, then once cleanly. 12 explicit merges, zero hand-resolution. 829/829 tests pass at the tip. Final state matches andreinknv:integration/all-prs (22 MCP tools, 8 hooks, 18 migrations).
Round 1: five PRs clean against current main, any order
Round 2: three PRs clean once Round 1 is in
Round 3: one anchor merge that brings in 18 PRs at once
#124 per-symbol coverage. This PR's branch carries the entire Phase-4 chain as its base (#102, #105, #106, #110, #111, #112-port, #113, #114, #115, #122-port, #123) plus the 5 Phase-1 fixes that touch src/extraction/index.ts (#93, #100, #101, #103, #104) plus #96, #97, #98, #99, #107, #108, #109. One merge → 18 PRs land as ancestors, each preserved as a merge commit in #124's history.
Round 4: three PRs, in this order
Order matters for Round 4: #125 → #126 → #121. The reason: #125 is now stacked on feat/coverage, so it fast-forwards over #124's anchor commit cleanly. #126 and #121 have non-overlapping changes if #125 is in first.
After Round 4: closing the rest
These 21 PRs have their content already on main via Round 3's ancestor chain. Each can be closed with a one-line "merged via #124's ancestor chain — see #120" comment, or you can gh pr merge each individually for per-PR attribution badges (most will be Already up to date):
Two PRs land as content-only (no branch-tip ancestor):
#112 (centrality+churn): content present via post-refactor port commit 38887ee. The original branch tip is on a different SHA chain. Use git merge -X theirs feat/centrality-and-churn for clean attribution, or close noting that the post-refactor port is already in.
#122 (drop-redundant-edge-indexes): content present via 017-drop-redundant-edge-indexes.ts (the file-based port that's been rebased onto post-#118). Same handling as #112, or merge perf/drop-redundant-edge-indexes directly, which is now in file-based-migration form.
External-contributor PRs not part of this path: #52, #57, #58, #66, #80, #89, #91. These need the original contributor to rebase or accept a co-authored rebase. Templates for that work below.
Alternative path: full per-PR attribution (no anchor)
If you want every PR branch to be a true git ancestor (so git log --merges shows all 35 PRs as separate merge commits, including #112), skip #124 as anchor and merge each PR directly. Same end-state (829/829 tests), more merge commits, ~9 hand-resolutions on shared files.
Easiest way: run the script
A bash script automates the entire 21-merge sequence with auto-resolution. Maintainer's effort drops to ~3 minutes.
git switch -c merge/all-prs colbymchenry/main # or wherever your main is
curl -fsSL https://raw.githubusercontent.com/andreinknv/codegraph/tools/merge-helper/scripts/merge-all-prs.sh | bash
npm run build && npm test   # expect 829/829
git push origin merge/all-prs # then click "Merge into main" once
The script:
Adds the andreinknv remote if missing, fetches all PR branches.
Walks Step A → J in order, doing git merge --no-ff for each PR.
For the 9 PRs with shared-file conflicts, automatically runs git checkout andreinknv/integration/all-prs -- <conflicted files> and commits.
Runs the final tools.ts dedup step after Step J.
Refuses to start with uncommitted changes or a merge in progress (idempotent: reset and re-run).
Source: andreinknv:tools/merge-helper:scripts/merge-all-prs.sh, verified end-to-end on a fresh colbymchenry:main checkout; it produces 35/35 ancestors and the same final tree as integration/all-prs.
Why this is harder than the easy path (manual)
Phase-4 PRs (#105, #106, #111, #112, #113, #114, #115) genuinely share src/db/queries.ts, src/db/schema.sql, src/index.ts, and src/mcp/tools.ts — each adds methods/tables/handlers to the same shared resources. Their pairwise conflicts are real semantic overlaps, not branch-divergence artifacts. The script above resolves each by checking out the integration branch's pre-validated combined version, which is why it just works.
Three PR branches were rebased to make this path tractable: #121 onto post-#117, #122 to file-based-migration form, #125 stacked on #124. (#93 and #110 were reverted to their original SHAs after testing showed the rebases broke their "ancestor of #124's chain" property; the close-out story matters more than the per-PR clean merge for those two.)
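To make the script's merge-then-auto-resolve behaviour concrete, here is a minimal sketch that only plans the git commands for each step instead of running them. It is not the contents of merge-all-prs.sh (which is bash): the MergeStep shape, planStep, planAll, and the placeholder branch names in the usage note are assumptions for illustration. The git invocations themselves are the ones named in the list above.

```typescript
// Hypothetical description of one step in the merge sequence.
interface MergeStep {
  branch: string;            // PR branch to merge
  conflictedFiles: string[]; // files known to conflict; empty for a clean merge
}

// Ref holding the pre-validated combined version of every shared file.
const INTEGRATION_REF = "andreinknv/integration/all-prs";

// Returns the git commands for one step, in the order they would run.
function planStep(step: MergeStep): string[] {
  const commands: string[] = [`git merge --no-ff ${step.branch}`];
  if (step.conflictedFiles.length > 0) {
    const files = step.conflictedFiles.join(" ");
    // Checking paths out of a tree-ish updates both the index and the
    // working tree, which clears the conflict markers for those paths.
    commands.push(`git checkout ${INTEGRATION_REF} -- ${files}`);
    // Conclude the in-progress merge with the default merge message.
    commands.push("git commit --no-edit");
  }
  return commands;
}

// Flattens the per-step plans into one ordered command list.
function planAll(steps: MergeStep[]): string[] {
  const all: string[] = [];
  for (const step of steps) {
    all.push(...planStep(step));
  }
  return all;
}
```

For example, planAll([{ branch: "andreinknv/clean-branch", conflictedFiles: [] }, { branch: "andreinknv/conflicting-branch", conflictedFiles: ["src/index.ts", "src/mcp/tools.ts"] }]) yields one merge command for the first step and a merge, checkout, commit triple for the second. Planning the commands separately from executing them also makes the sequence easy to review before anything touches the working tree.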
Sequence (verified end-to-end)
I walked this against a fresh colbymchenry:main. Counts below are exact files-conflicted.
#93 hand-resolves on 3 files (src/types.ts, src/extraction/index.ts, __tests__/sync.test.ts) — git checkout integration/all-prs -- <files>. #100, #101, #103, #104 then become clean (their conflict was with #93's pre-rebase form, not pairwise).
hand-resolves on 2 files (src/index.ts, src/mcp/tools.ts). Same shortcut.
F
#102 clean ✓ ; #122 hand-resolves on 3 files (__tests__/foundation.test.ts, __tests__/pr19-improvements.test.ts, src/db/migrations/index.ts) — its file-based port adds migration 017 which conflicts with the registry index expecting it not to be there yet.
#112 hand-resolves on 12 files (the biggest single conflict). After this lands, #113, #114, #115 become ancestors automatically — no merge needed for them.
Final cleanup step (caveat the doc previously missed): the iterative git checkout integration/all-prs -- <file> approach can leave duplicate symbols in src/mcp/tools.ts (the same handler/import added by multiple steps' integration files). Run one final git checkout integration/all-prs -- src/mcp/tools.ts after Step J to dedupe. Tests pass before the cleanup; only tsc complains.
End-state (verified):
829/829 tests pass at the tip (94 commits on upstream/main)
Same final tree as integration/all-prs
Why the easy path is still the default recommendation: the anchor approach (12 clean merges) gets the maintainer to the same green-test state in ~5 minutes of click-merge work. The no-anchor alternative is ~30-45 minutes of git checkout integration/all-prs -- <file> invocations for marginal benefit on attribution badges. Only worth it if those badges matter to you for #112 specifically.
The merge order, by phase
The phases below are the conceptual structure the rest of this document references — every dependency table, every rebase template assumes them. The "Easiest path" above collapses these phases into 4 rounds; this list shows the underlying grouping.
Phase 1: easy wins (any order, no shared state)
#109 watcher flake fix
#96 docs: non-Claude clients
#97 docs: language cookbook
#93 git submodules
#98 correctness bugs
#99 defense-in-depth hardening
#100 HEAD-move sync detection
#101 extraction/resolution accuracy
#103 .codegraphignore on git fast path
#104 FTS stemming/subwords
#107 search per-file diversification
#108 perf db cache fix
#110 review-context MCP tool (requires #117 first; already rebased onto the post-#117 pattern)
Phase 2: the 4 unblockers (any order, fully independent of each other)
#116 refactor: language registry
#117 refactor: MCP tool registry
#118 refactor: file-based migrations
#119 refactor: index-hook framework
Phase 3: language PRs — each rebases onto #116 (1-file changes)
#92 HCL/Terraform
#94 R
#95 SQL
#91 Scala (firehooper)
#66 Vue (abhijeetj100)
#58 ReScript (malo)
#57 mql5 (cfournel)
#80 pgsql similarity search (bhushan)
#89 framework route extraction (timomeara)
Phase 4: schema-touching feature PRs — rebase onto #118 + #117 + #119
#102 UNIQUE on edges
#105 cochange-graph
#111 LLM features (4 migrations)
#112 centrality + churn
#113 issue-history
#114 config-refs
#115 sql-refs
#122 perf: drop redundant edge indexes (still in monolithic shape — apply Phase 4 cochange template)
#123 perf: split embeddings + in-memory cache + drop redundant co_changes index (already in file-based shape; depends on #105 + #111)
#124 feat: per-symbol coverage from lcov + codegraph_coverage MCP tool (file-based migration; depends on #118)
#125 feat: static-analysis biomarkers + Code Health + codegraph_biomarkers MCP tool (file-based migration + IndexHook; depends on #118 + #119)
Phase 5: misc
#52 Cursor IDE installer
The four refactors in Phase 2 don't conflict with each other and can be reviewed and merged in any order. Once they're in, every remaining open PR becomes a small, focused, mostly-additive diff.
Why these 4 refactors exist
Every "feature" PR currently fights one of four shared bottlenecks. Each refactor turns the bottleneck into a "drop one file" pattern.
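src/types.ts: Language union + DEFAULT_CONFIG.include + src/extraction/grammars.ts (WASM_GRAMMAR_FILES, EXTENSION_MAP, getLanguageDisplayName) + src/extraction/languages/index.ts EXTRACTORS map + src/extraction/tree-sitter.ts extractor dispatch
src/mcp/tools.ts: tools[] array + case switch in execute()
src/db/migrations.ts: CURRENT_SCHEMA_VERSION constant + monolithic migrations[] array (and the silent-data-loss bug class where two PRs claiming the same v4 silently no-op the second one)
CodeGraph.runDerivedSignals / runIssueHistoryPass / runConfigRefsPass / runSqlRefsPass private methods + their call sites in indexAll and sync
Each refactor PR ships with: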
Behavior-preserving changes only (no migrations, no API breaks).
Backward-compat re-exports so existing imports keep working unchanged.
Structural-invariant tests that fail loudly if a future PR drifts from the registry pattern.
A reviewer pass with at least one bug caught and fixed.
Phase 1: 13 PRs that don't conflict with anything
These are small fixes, docs, or features that don't touch the bottleneck files. Merge in any order. One caveat: #110 requires #117 first (it adds an MCP tool, so it must land after the MCP tool registry refactor). #110 is already pre-rebased onto that pattern, so once #117 merges, #110 needs no further rebase work — that's why it appears in the Phase 1 list (no contributor action required).
Phase 2: the 4 refactors
Open PRs: #116 (language registry), #117 (MCP tool registry), #118 (file-based migrations), #119 (index-hook framework).
They don't conflict with each other. Pick any order. Each is a self-contained refactor with no behavior change, full backward compat, and structural-invariant tests.
Preview branch — what main looks like after all 4 land: andreinknv:preview/all-four-refactors-merged — branched from colbymchenry:main, the four refactor branches merged in sequence (#117 → #118 → #119 → #116). 417/417 tests pass at the tip (380 baseline + 37 new structural-invariant tests across the four refactors). Each refactor PR also has a maintainer review checklist pinned at the top of its description.
Recommended review order if you want to validate one at a time:
#116 (language registry) is the easiest to mentally verify: the language list moves into per-file definitions, and EXTRACTORS becomes derived from the registry. 16 invariant tests pin down the equivalence.
#118 (migrations) is the highest-impact safety win: it eliminates a silent-data-loss bug class. Two PRs both claiming v4 used to result in "second one's migration silently no-ops on existing DBs." Now their filenames collide on the filesystem instantly. Reviewer caught a hand-typed-version-vs-filename drift bug; fixed by parsing version from filename.
#117 (MCP tools) is a partial extraction: the tools[] array and case switch are gone, but handler bodies still live in ToolHandler. Adding a new tool drops from a 4-way conflict to a 1-way conflict (a method addition). Full body extraction is a follow-up.
#119 (index hooks) ships with zero registered hooks: it's pure scaffolding for #105 / #112-#115. The framework runs no-ops on main; it only earns its keep when those PRs rebase onto it.
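The "parse the version from the filename" fix can be illustrated with a small sketch. This is not the code in #118: parseMigrationVersion, buildVersionIndex, and currentSchemaVersion are hypothetical names, and the three-digit-prefix pattern is inferred from filenames such as 018-node-coverage.ts. One extra assumption: two files that share a prefix but differ in the rest of the name would not collide on disk, so the sketch also rejects duplicate prefixes explicitly.

```typescript
// Hypothetical helper: derive the schema version from a migration filename,
// e.g. "004-cochange-graph.ts" -> 4. No hand-typed version field is needed.
function parseMigrationVersion(filename: string): number {
  const match = /^(\d{3})-[a-z0-9-]+\.ts$/.exec(filename);
  if (match === null) {
    throw new Error(`Not a migration filename: ${filename}`);
  }
  return Number(match[1]);
}

// Builds a version -> filename index and fails loudly when two files
// claim the same numeric prefix.
function buildVersionIndex(filenames: string[]): Map<number, string> {
  const byVersion = new Map<number, string>();
  for (const name of filenames) {
    const version = parseMigrationVersion(name);
    const existing = byVersion.get(version);
    if (existing !== undefined) {
      throw new Error(`Version ${version} claimed by both ${existing} and ${name}`);
    }
    byVersion.set(version, name);
  }
  return byVersion;
}

// The current schema version is the highest registered version,
// so no PR has to bump a shared constant by hand.
function currentSchemaVersion(index: Map<number, string>): number {
  let max = 0;
  index.forEach((_name, version) => {
    if (version > max) max = version;
  });
  return max;
}
```

With this shape, a second PR that reuses an already-taken prefix fails at registry-build time instead of silently skipping its migration on existing databases.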
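After all four land, Phases 3-5 become much simpler.
Phase 3: language PRs (every rebase looks the same)
Pre-requisite: #116 must be merged first.
Rebase status: … LanguageDef pattern. Branches force-pushed; mergeable as soon as #116 lands.
Every open language PR currently edits 6 monolithic lists across src/types.ts, src/extraction/grammars.ts, CLAUDE.md, and __tests__/extraction.test.ts. After #116, the rebase is mechanical:
Rebase template (concrete: HCL = #92)
Step 1. Discard the conflicting edits:
git checkout feat/hcl-terraform-support
git rebase main   # post-#116 main
# You will hit conflicts on every monolithic file. Resolve by DELETING
# the existing additions and creating one new file instead: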
Step 2. Create one file at src/extraction/languages/hcl.ts:
import { hclExtractor } from './hcl-extractor-config'; // Or reuse your existing extractor file
import type { LanguageDef } from './types';

export const HCL_DEF: LanguageDef = {
  name: 'hcl',
  displayName: 'HCL / Terraform',
  extensions: ['.tf', '.tfvars', '.hcl'],
  includeGlobs: ['**/*.tf', '**/*.tfvars', '**/*.hcl'],
  // For grammar-backed languages:
  // grammar: { wasmFile: 'tree-sitter-hcl.wasm', vendored: true, extractor: hclExtractor },
  // For custom-extractor languages (HCL has its own extractor):
  customExtractor: (filePath, source) => new HclExtractor(filePath, source).extract(),
};
Step 3. Add 2 lines to src/extraction/languages/registry.ts:
import { HCL_DEF } from './hcl'; // alphabetical: between c-cpp and impact
// ...
const ALL_DEFS: readonly LanguageDef[] = [
  // ...
  HCL_DEF,
  // ...
];
Step 4. (Optional) If HCL needs language-specific narrowing in src/resolution/index.ts or src/resolution/import-resolver.ts, add 'hcl' to the Language union in src/types.ts. Most languages don't need this.
Step 5. Keep your test additions in their original __tests__/extraction.test.ts describe block — those don't conflict with anything (the language registry refactor doesn't touch tests).
The diff size goes from "6 files modified" to "1 new file + 2 lines in registry.ts" (+ optional 1 line in Language union).
Same template applies to #94 R, #95 SQL, #91 Scala, #66 Vue, #58 ReScript, #57 mql5. Each language gets its own file; no two languages conflict if their alphabetical positions in registry.ts aren't adjacent.
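The reason one file per language is enough is that the formerly hand-maintained lists can be derived from the registry. The sketch below shows that idea under assumptions: the LanguageDef here is a reduced stand-in for the real type, the R entry and its extension are made up for the example, and buildExtensionMap and listLanguageNames are hypothetical helper names, not functions confirmed to exist in #116.

```typescript
// Reduced, hypothetical version of the LanguageDef shape used in the template.
interface LanguageDef {
  name: string;
  displayName: string;
  extensions: string[];
}

const HCL_DEF: LanguageDef = {
  name: "hcl",
  displayName: "HCL / Terraform",
  extensions: [".tf", ".tfvars", ".hcl"],
};

// Made-up second entry, only here so the derivation has two inputs.
const R_DEF: LanguageDef = { name: "r", displayName: "R", extensions: [".r"] };

const ALL_DEFS: readonly LanguageDef[] = [HCL_DEF, R_DEF];

// Derives the extension -> language lookup from the registry instead of
// maintaining a separate hand-edited map. Duplicate claims fail loudly.
function buildExtensionMap(defs: readonly LanguageDef[]): Map<string, string> {
  const byExtension = new Map<string, string>();
  for (const def of defs) {
    for (const ext of def.extensions) {
      const owner = byExtension.get(ext);
      if (owner !== undefined) {
        throw new Error(`Extension ${ext} claimed by both ${owner} and ${def.name}`);
      }
      byExtension.set(ext, def.name);
    }
  }
  return byExtension;
}

// Derives the list of language names in registry order.
function listLanguageNames(defs: readonly LanguageDef[]): string[] {
  return defs.map((def) => def.name);
}
```

A structural-invariant test in this style would call buildExtensionMap(ALL_DEFS) once and let the duplicate check catch two language PRs that both claim the same extension.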
#80 pgsql similarity and #89 framework route extraction aren't language PRs — they don't need this template. Land them in Phase 1.
Phase 4: schema-touching feature PRs
Pre-requisite: #118 must be merged first (and ideally #117 + #119 too, for PRs that also touch MCP tools or sync hooks).
Rebase status
#102 (UNIQUE on edges): … 008-edges-unique.ts).
#105 (cochange-graph): … src/index-hooks/cochange.ts + src/db/migrations/010-co-changes.ts.
#113 (issue-history): already rebased, 1 hook + migration 005.
⏳ #122 (perf: drop redundant edge indexes): currently in monolithic migrations.ts shape (version: 4). Targets main directly and is mergeable today, but once #118 lands, apply the Phase 4 cochange template to convert: move the body to src/db/migrations/<NNN>-drop-redundant-edge-indexes.ts, register it, drop the CURRENT_SCHEMA_VERSION bump (now auto-derived). Empirically validated via scripts/spikes/spike-edge-indexes.mjs: -22% DB size, 1.37× faster bulk insert, no query regression (EXPLAIN-confirmed covering scan via idx_edges_source_kind).
#123 (perf: split embeddings + in-memory cache + drop redundant co_changes index): migrations 015-prune-co-changes-index.ts + 016-split-symbol-embeddings.ts, plus an EmbeddingCache + topKByCosineMatrix integrated into searchHybrid and findSimilar. Depends on #105 (co_changes table) and #111 (symbol_summaries.embedding column). Spike-validated (scripts/spikes/spike-embedding-split.mjs): 3.22× faster summary-only scans, 4.4× faster top-K cosine search. Independent reviewer pass caught and fixed: missing embeddingCache.invalidate() in clear(), duplicate idx_co_changes_b in schema.sql, sparse-array holes in EmbeddingCache.get on dim-mismatch.
✅ #124 (feat: per-symbol coverage from lcov): already in post-refactor file-based shape: migration 018-node-coverage.ts, src/coverage/{lcov,index}.ts parser + ingestion, codegraph_coverage MCP tool (modes: symbol/ranked/stats), codegraph coverage <report> CLI. Lifts centrality > 0.5 AND coverage = 0 into a single agent-callable query: high-impact untested code in one row. Independent reviewer pass caught and fixed: Istanbul -1 excluded-line sentinel inflated totalLines, the SQL pct could be NULL where the TS type said number, monorepo-relative report paths failed to match. Two follow-up fix commits added (April 28): (a) src/coverage/{index,lcov}.ts were never committed in the original PR because of an unanchored coverage/ .gitignore rule that silently swallowed them; fixed by anchoring the rule to /coverage/ (also filed as standalone PR #127) and adding the missing module; (b) getCoverageStats(source) returned all sources in sources[] regardless of the filter; narrowed via WHERE clause + regression test.
✅ #125 (feat: static-analysis biomarkers + Code Health): already in post-refactor shape: migration 019-code-health-findings.ts, src/biomarkers/ engine (5 biomarkers: Large Method, Complex Method, Nested Complexity, Complex Conditional, Brain Method), IndexHook that runs after every indexAll/sync, codegraph_biomarkers MCP tool. Languages covered: TS/JS, Python, Go, Java (others skipped gracefully). Independent reviewer pass caught and fixed: stale findings persisted forever when a function was refactored clean, countConditionalOperands inflated counts via inner lambdas, else_clause over-counted cyclomatic vs standard McCabe. Smoke-tested on codegraph itself: 397 findings on 198 symbols; top hotspot visitNode in tree-sitter.ts (Brain Method, cyclomatic 48, nesting 15, Code Health 3/10).
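Two of the coverage fixes mentioned for #124 (the negative excluded-line sentinel inflating totalLines, and pct being NULL where a number was expected) are easy to picture with a small per-symbol calculation over lcov DA records. This is a sketch under assumptions, not the parser in src/coverage/lcov.ts: coverageForRange and SymbolCoverage are hypothetical names, and only the SF, DA, and end_of_record records of the lcov tracefile format are handled.

```typescript
interface SymbolCoverage {
  totalLines: number;   // instrumented lines inside the symbol's range
  coveredLines: number; // instrumented lines with at least one hit
  pct: number | null;   // null when the range has no instrumented lines
}

// Computes coverage for one symbol, given its file and line range.
function coverageForRange(
  lcov: string,
  file: string,
  startLine: number,
  endLine: number,
): SymbolCoverage {
  let inFile = false;
  let total = 0;
  let covered = 0;
  for (const rawLine of lcov.split("\n")) {
    const line = rawLine.trim();
    if (line.startsWith("SF:")) {
      inFile = line.slice(3) === file;
      continue;
    }
    if (line === "end_of_record") {
      inFile = false;
      continue;
    }
    if (!inFile || !line.startsWith("DA:")) continue;
    // DA:<line number>,<execution count>[,<checksum>]
    const parts = line.slice(3).split(",");
    const lineNo = Number(parts[0]);
    const hits = Number(parts[1]);
    if (!Number.isInteger(lineNo) || !Number.isFinite(hits)) continue;
    if (hits < 0) continue; // excluded-line sentinel: not an instrumented line
    if (lineNo < startLine || lineNo > endLine) continue;
    total += 1;
    if (hits > 0) covered += 1;
  }
  const pct = total === 0 ? null : (covered / total) * 100;
  return { totalLines: total, coveredLines: covered, pct };
}
```

Typing pct as number | null keeps the "no instrumented lines" case distinct from 0% coverage, which matters for a query such as centrality > 0.5 AND coverage = 0 that should only surface code known to be untested.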
The template below remains the canonical "how to add a schema-touching feature on top of refactors" reference.
version: 4 entry to migrations[] in src/db/migrations.ts
Bumps CURRENT_SCHEMA_VERSION to 4
A runDerivedSignals-style hook into CodeGraph.indexAll and sync
New tables to src/db/schema.sql
A new src/cochange/index.ts module
After Phase 2 lands, the rebase becomes:
Step 1. Move the migration body to its own file:
# Find the next free 3-digit prefix:
ls src/db/migrations/[0-9]*.ts
# 002-project-metadata.ts
# 003-lower-name-index.ts
# → next free is 004
Create src/db/migrations/004-cochange-graph.ts:
import type { MigrationModule } from './types';

export const MIGRATION: MigrationModule = {
  description: 'Add co-change graph: per-file commit_count + co_changes table',
  up: (db) => {
    // ... your existing migration body, copied verbatim ...
    const cols = db.prepare(`PRAGMA table_info(files);`).all() as Array<{ name: string }>;
    if (!cols.some((c) => c.name === 'commit_count')) {
      db.exec(`ALTER TABLE files ADD COLUMN commit_count INTEGER NOT NULL DEFAULT 0;`);
    }
    db.exec(`
      CREATE TABLE IF NOT EXISTS co_changes (
        file_a TEXT NOT NULL,
        file_b TEXT NOT NULL,
        count INTEGER NOT NULL,
        PRIMARY KEY (file_a, file_b),
        CHECK (file_a < file_b)
      );
      CREATE INDEX IF NOT EXISTS idx_co_changes_a ON co_changes(file_a);
      CREATE INDEX IF NOT EXISTS idx_co_changes_b ON co_changes(file_b);
    `);
  },
};
Step 2. Register it in src/db/migrations/index.ts:
Step 3. Drop the CURRENT_SCHEMA_VERSION bump from your PR. It's auto-derived from the registry now.
Step 4. Move the runDerivedSignals integration to a hook file src/index-hooks/cochange.ts:
import type { IndexHook, IndexHookContext } from './registry';
import type { SyncResult } from '../extraction';
import { mineCoChanges, applyCoChangeDeltas, ... } from '../cochange';
import { logDebug } from '../errors';

export const HOOK: IndexHook = {
  name: 'cochange',
  async afterIndexAll(ctx) {
    if (ctx.config.enableCoChange === false) return;
    // Move the body of your old runCoChangePass here.
    // Use ctx.queries / ctx.projectRoot / ctx.config.
  },
  async afterSync(ctx, result) {
    if (ctx.config.enableCoChange === false) return;
    // Same, but for incremental sync.
  },
};
Step 5. Register the hook in src/index-hooks/registry.ts:
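How a hook registry of this kind could dispatch is sketched below. It is an illustration under assumptions, not the contents of src/index-hooks/registry.ts: the reduced IndexHookContext shape, registerHook, runAfterIndexAll, and the choice to collect the names of failing hooks instead of aborting the run are all hypothetical.

```typescript
// Reduced, hypothetical context; the real one also exposes queries.
interface IndexHookContext {
  projectRoot: string;
  config: { enableCoChange?: boolean };
}

interface IndexHook {
  name: string;
  afterIndexAll?(ctx: IndexHookContext): Promise<void>;
  afterSync?(ctx: IndexHookContext, result: { filesChanged: number }): Promise<void>;
}

// Adds a hook to the list, rejecting duplicate names so two PRs cannot
// silently register the same pass twice.
function registerHook(hooks: IndexHook[], hook: IndexHook): void {
  if (hooks.some((existing) => existing.name === hook.name)) {
    throw new Error(`Hook already registered: ${hook.name}`);
  }
  hooks.push(hook);
}

// Runs every afterIndexAll hook in registration order. A failing hook is
// recorded by name and does not stop the remaining hooks (an assumption).
async function runAfterIndexAll(
  hooks: readonly IndexHook[],
  ctx: IndexHookContext,
): Promise<string[]> {
  const failed: string[] = [];
  for (const hook of hooks) {
    if (!hook.afterIndexAll) continue;
    try {
      await hook.afterIndexAll(ctx);
    } catch {
      failed.push(hook.name);
    }
  }
  return failed;
}

// Example hook mirroring the opt-out check from the template above.
const COCHANGE_HOOK: IndexHook = {
  name: "cochange",
  async afterIndexAll(ctx) {
    if (ctx.config.enableCoChange === false) return;
    // The co-change mining pass would run here.
  },
};
```

Registration then amounts to one call such as registerHook(hooks, COCHANGE_HOOK), which is the kind of one-line addition that keeps feature PRs from editing CodeGraph.indexAll and sync directly.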
Step 4. Add async handleHotspots(args: Record<string, unknown>): Promise<ToolResult> { ... } on ToolHandler in src/mcp/tools.ts. The case switch is gone — execute() finds your handler via the registry.
The ToolHandler implements ToolHandlerLike constraint catches the case where you add a key to HandlerKey but forget to implement the method (compile-time error).
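The compile-time guarantee described above can be reproduced with a mapped type. The sketch below is an assumption about the general shape, not the code in #117: the ToolResult fields, the tool names codegraph_search and codegraph_hotspots, and the TOOL_TO_HANDLER map are made up for the example, while HandlerKey, ToolHandlerLike, ToolHandler, handleHotspots, and execute are names taken from the text.

```typescript
type ToolResult = { content: string };

// Adding a key here without adding the matching method to ToolHandler
// makes the `implements` clause below fail to compile.
type HandlerKey = "handleSearch" | "handleHotspots";

type ToolHandlerLike = {
  [K in HandlerKey]: (args: Record<string, unknown>) => Promise<ToolResult>;
};

class ToolHandler implements ToolHandlerLike {
  async handleSearch(args: Record<string, unknown>): Promise<ToolResult> {
    return { content: `search:${String(args["query"] ?? "")}` };
  }

  async handleHotspots(_args: Record<string, unknown>): Promise<ToolResult> {
    return { content: "hotspots" };
  }
}

// Hypothetical registry: tool name -> handler method. This lookup is what
// stands in for the old case switch.
const TOOL_TO_HANDLER = new Map<string, HandlerKey>([
  ["codegraph_search", "handleSearch"],
  ["codegraph_hotspots", "handleHotspots"],
]);

// Finds the handler via the registry and invokes it.
async function execute(
  handler: ToolHandler,
  toolName: string,
  args: Record<string, unknown>,
): Promise<ToolResult> {
  const key = TOOL_TO_HANDLER.get(toolName);
  if (key === undefined) {
    throw new Error(`Unknown tool: ${toolName}`);
  }
  return handler[key](args);
}
```

Because TOOL_TO_HANDLER is typed with HandlerKey values, a registry entry pointing at a method that does not exist is also rejected by the compiler, so both directions of drift surface before runtime.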
Validation steps (apply after every PR lands)
npm run build   # tsc + asset copy
npm test        # full suite; no flake on CI. The watcher test is the one known flake (PR #109 fixes it)
# After #116 (language registry):
node -e "console.log(require('./dist').getSupportedLanguages())"
node -e "console.log(require('./dist/types').DEFAULT_CONFIG.include.length)"
# After #118 (migrations):
node -e "
  const { CURRENT_SCHEMA_VERSION, ALL_MIGRATIONS } = require('./dist/db/migrations');
  console.log('Versions:', ALL_MIGRATIONS.map(m => m.version));
  console.log('Current:', CURRENT_SCHEMA_VERSION);
"
# After #117 (MCP tools):
node -e "console.log(require('./dist/mcp/tools/registry').getToolModules().map(m => m.definition.name))"
Validation (April 28, 2026): all 30 mergeable PRs stacked together as andreinknv:integration/all-prs, built fresh from colbymchenry:main via 18 sequential merges. 829/829 tests pass, 8 index-hooks fire, 22 MCP tools register, 18 migrations apply cleanly. indexAll + sync round-trip on 4 real codebases (codegraph itself, asf-platform, asf-dashboard, mcp-obsidian-extended).
Of the 35 PR branches: 33 are direct ancestors of integration/all-prs via merge commits. The 2 exceptions — #112 (centrality+churn) and #122 (drop-redundant-edge-indexes) — have their content present via post-refactor port commits in the integration history but their original branch tips aren't ancestors. This is intentional: #122 is documented above as remaining in monolithic shape targeting main directly, and #112 was rebased onto post-refactor patterns as a separate commit chain. Both PRs' code is exercised by the test suite via the integration.
Findings from rebuilding the integration from scratch (worth knowing for the merge order):
Phase-4 PRs whose branches were rebased onto e1b0595-shape (the user's Phase-4 base) can mostly be merged via merging any one of feat/coverage / feat/biomarkers / feat/mcp-server-instructions — that single merge brings in the rest of the Phase-4 stack as ancestors. This is the practical shortcut the maintainer can take.
(Earlier integration runs measured 757/757 tests + 13 tools + 7 hooks; the deltas are #124, #125, #126, #121, plus the 16 invariant tests added by #116's reviewer pass.)
Maintainer guide: how to merge the open PR backlog with minimal friction
There are ~24 open PRs touching
colbymchenry/codegraph. Many of them collide on the same handful of monolithic data structures (lists, switches, version numbers, hardcoded method additions). I've opened 4 refactor PRs that eliminate those collision surfaces. This document tells you the order to merge things in to minimize manual conflict-resolution work, and shows you exactly what the rebase of every existing open PR looks like.Easiest path for the maintainer: 12 clean merges in 4 rounds
I walked this path end-to-end against a fresh
colbymchenry:maincheckout three times to confirm — twice with intermediate rebases that broke things in subtle ways, then once cleanly. 12 explicit merges, zero hand-resolution. 829/829 tests pass at the tip. Final state matchesandreinknv:integration/all-prs(22 MCP tools, 8 hooks, 18 migrations).Round 1 — five PRs clean against current
main, any ordercoverage/to repo rootRound 2 — three PRs clean once Round 1 is in
Round 3 — one anchor merge that brings in 18 PRs at once
src/extraction/index.ts(#93, #100, #101, #103, #104) plus #96, #97, #98, #99, #107, #108, #109. One merge → 18 PRs land as ancestors, each preserved as a merge commit in #124's history.Round 4 — three PRs, in this order
Order matters for Round 4: #125 → #126 → #121. The reason: #125 is now stacked on
feat/coverage, so it fast-forwards over #124's anchor commit cleanly. #126 and #121 have non-overlapping changes if #125 is in first.feat/coverage+ one biomarkers commit)tools.tsandtools/status.tsin non-overlapping ways)After Round 4 — closing the rest
These 21 PRs have their content already on
mainvia Round 3's ancestor chain. Each can be closed with a one-line "merged via #124's ancestor chain — see #120" comment, or you cangh pr mergeeach individually for per-PR attribution badges (most will beAlready up to date):Two PRs land as content-only (no branch-tip ancestor):
38887ee. The original branch tip is on a different SHA chain. Usegit merge -X theirs feat/centrality-and-churnfor clean attribution, or close noting that the post-refactor port is already in.017-drop-redundant-edge-indexes.ts(the file-based port that's been rebased onto post-refactor: file-based migrations — eliminate silent version-collision bugs #118). Same handling as feat(graph): PageRank centrality + per-file churn metrics #112, or mergeperf/drop-redundant-edge-indexesdirectly which is now in file-based-migration form.External-contributor PRs not part of this path: #52, #57, #58, #66, #80, #89, #91 — these need the original contributor to rebase or accept a co-authored rebase. Templates for that work below.
Alternative path: full per-PR attribution (no anchor)
If you want every PR branch to be a true
gitancestor (sogit log --mergesshows all 35 PRs as separate merge commits, including #112), skip #124 as anchor and merge each PR directly. Same end-state (829/829 tests), more merge commits, ~9 hand-resolutions on shared files.Easiest way: run the script
A bash script automates the entire 21-merge sequence with auto-resolution. Maintainer's effort drops to ~3 minutes.
The script:
andreinknvremote if missing, fetches all PR branches.git merge --no-fffor each PR.git checkout andreinknv/integration/all-prs -- <conflicted files>and commits.tools.tsdedup step after Step J.Source:
andreinknv:tools/merge-helper:scripts/merge-all-prs.sh— verified end-to-end on a freshcolbymchenry:maincheckout, produces 35/35 ancestors and the same final tree asintegration/all-prs.Why this is harder than the easy path (manual)
Phase-4 PRs (#105, #106, #111, #112, #113, #114, #115) genuinely share
src/db/queries.ts,src/db/schema.sql,src/index.ts, andsrc/mcp/tools.ts— each adds methods/tables/handlers to the same shared resources. Their pairwise conflicts are real semantic overlaps, not branch-divergence artifacts. The script above resolves each by checking out the integration branch's pre-validated combined version, which is why it just works.Three PR branches were rebased to make this path tractable: #121 onto post-#117, #122 to file-based-migration form, #125 stacked on #124. (#93 and #110 were reverted to their original SHAs after testing showed the rebases broke their "ancestor of #124's chain" property — the close-out story matters more than the per-PR clean merge for those two.)
Sequence (verified end-to-end)
I walked this against a fresh
colbymchenry:main. Counts below are exact files-conflicted.src/types.ts,src/extraction/index.ts,__tests__/sync.test.ts) —git checkout integration/all-prs -- <files>. #100, #101, #103, #104 then become clean (their conflict was with #93's pre-rebase form, not pairwise).src/index.ts,src/mcp/tools.ts). Same shortcut.__tests__/foundation.test.ts,__tests__/pr19-improvements.test.ts,src/db/migrations/index.ts) — its file-based port adds migration017which conflicts with the registry index expecting it not to be there yet.Final cleanup step (caveat the doc previously missed): the iterative
git checkout integration/all-prs -- <file>approach can leave duplicate symbols insrc/mcp/tools.ts(the same handler/import added by multiple steps' integration files). Run one finalgit checkout integration/all-prs -- src/mcp/tools.tsafter Step J to dedupe. Tests pass before the cleanup; onlytsccomplains.End-state (verified):
gitancestors ofmainupstream/main)integration/all-prsWhy the easy path is still the default recommendation: the anchor approach (12 clean merges) gets the maintainer to the same green-test state in ~5 minutes of click-merge work. The no-anchor alternative is ~30-45 minutes of
git checkout integration/all-prs -- <file>invocations for marginal benefit on attribution badges. Only worth it if those badges matter to you for #112 specifically.The merge order, by phase
The phases below are the conceptual structure the rest of this document references — every dependency table, every rebase template assumes them. The "Easiest path" above collapses these phases into 4 rounds; this list shows the underlying grouping.
The four refactors in Phase 2 don't conflict with each other and can be reviewed and merged in any order. Once they're in, every remaining open PR becomes a small, focused, mostly-additive diff.
Why these 4 refactors exist
Every "feature" PR currently fights one of four shared bottlenecks. Each refactor turns the bottleneck into a "drop one file" pattern.
src/types.tsLanguageunion +DEFAULT_CONFIG.include+src/extraction/grammars.ts(WASM_GRAMMAR_FILES,EXTENSION_MAP,getLanguageDisplayName) +src/extraction/languages/index.tsEXTRACTORSmap +src/extraction/tree-sitter.tsextractor dispatchsrc/mcp/tools.tstools[]array +caseswitch inexecute()src/db/migrations.tsCURRENT_SCHEMA_VERSIONconstant + monolithicmigrations[]array (and the silent-data-loss bug class where two PRs claiming the same v4 silently no-op the second one)CodeGraph.runDerivedSignals/runIssueHistoryPass/runConfigRefsPass/runSqlRefsPassprivate methods + their call sites inindexAllandsyncEach refactor PR ships with:
Phase 1: 13 PRs that don't conflict with anything
These are small fixes, docs, or features that don't touch the bottleneck files. Merge in any order. One caveat: #110 requires #117 first (it adds an MCP tool, so it must land after the MCP tool registry refactor). #110 is already pre-rebased onto that pattern, so once #117 merges, #110 needs no further rebase work — that's why it appears in the Phase 1 list (no contributor action required).
fs.watchflake in__tests__/watcher.test.ts.codegraphignoreon git fast pathsrc/db/queries.tsonlycodegraph_review_contextMCP toolPhase 2: the 4 refactors
Open PRs:
refactor: per-language registry — eliminate cross-PR conflict surface for language additionsrefactor: per-tool MCP registry — eliminate tools[] + case-switch conflictsrefactor: file-based migrations — eliminate silent version-collision bugsrefactor: index-hook framework — eliminate per-pass CodeGraph mutationsThey don't conflict with each other. Pick any order. Each is a self-contained refactor with no behavior change, full backward compat, and structural-invariant tests.
Recommended review order if you want to validate one at a time:
refactor: per-language registry — eliminate cross-PR conflict surface for language additions #116 (language registry) is the easiest to mentally verify — the language list moves into per-file definitions;
EXTRACTORSbecomes derived from the registry. 16 invariant tests pin down the equivalence.refactor: file-based migrations — eliminate silent version-collision bugs #118 (migrations) is the highest-impact safety win — it eliminates a silent-data-loss bug class. Two PRs both claiming v4 used to result in "second one's migration silently no-ops on existing DBs." Now their filenames collide on the filesystem instantly. Reviewer caught a hand-typed-version-vs-filename drift bug; fixed by parsing version from filename.
refactor: per-tool MCP registry — eliminate tools[] + case-switch conflicts #117 (MCP tools) is a partial extraction —
tools[]array andcaseswitch are gone, but handler bodies still live inToolHandler. Adding a new tool drops from a 4-way conflict to a 1-way conflict (a method addition). Full body extraction is a follow-up.refactor: index-hook framework — eliminate per-pass CodeGraph mutations #119 (index hooks) ships with zero registered hooks — it's pure scaffolding for feat(cochange): file-level co-change graph mined from git history #105 / feat(graph): PageRank centrality + per-file churn metrics #112-feat(graph): sql_refs — SQL string-literal call sites (pairs with #95) #115. The framework runs no-ops on main; it only earns its keep when those PRs rebase onto it.
After all four land, Phases 3-5 become much simpler.
Phase 3: language PRs — every rebase looks the same
Pre-requisite: #116 must be merged first.
Rebase status:
LanguageDef pattern. Branches force-pushed; mergeable as soon as refactor: per-language registry — eliminate cross-PR conflict surface for language additions #116 lands.
Every open language PR currently edits 6 monolithic lists across src/types.ts, src/extraction/grammars.ts, CLAUDE.md, and __tests__/extraction.test.ts. After #116, the rebase is mechanical:
Rebase template (concrete: HCL = #92)
Step 1. Discard the conflicting edits:
Step 2. Create one file at src/extraction/languages/hcl.ts:
Step 3. Add 2 lines to src/extraction/languages/registry.ts:
Step 4. (Optional) If HCL needs language-specific narrowing in src/resolution/index.ts or src/resolution/import-resolver.ts, add 'hcl' to the Language union in src/types.ts. Most languages don't need this.
Step 5. Keep your test additions in their original __tests__/extraction.test.ts describe block; those don't conflict with anything (the language registry refactor doesn't touch tests).
The diff size goes from "6 files modified" to "1 new file + 2 lines in registry.ts" (+ optional 1 line in the Language union).
Same template applies to #94 R, #95 SQL, #91 Scala, #66 Vue, #58 ReScript, #57 mql5. Each language gets its own file; no two languages conflict if their alphabetical positions in registry.ts aren't adjacent.
#80 pgsql similarity and #89 framework route extraction aren't language PRs, so they don't need this template. Land them in Phase 1.
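The shape of the language template above can be sketched in a single file. This is an illustration only: the LanguageDef fields, the regex-based extractor, and the helper names (languageForFile, extractSymbols) are assumptions, and the real #116 code presumably wires tree-sitter grammars rather than regular expressions.

```typescript
// Hypothetical shape; the real LanguageDef in #116 may have different fields.
interface LanguageDef {
  id: string;
  extensions: string[];
  extract: (source: string) => string[];
}

// What src/extraction/languages/hcl.ts might contain (one new file per language).
const hcl: LanguageDef = {
  id: "hcl",
  extensions: [".tf", ".hcl"],
  // Placeholder extractor: returns the first label of each
  // resource/variable/module block header.
  extract: (source) => {
    const names: string[] = [];
    const blockHeader = /^\s*(?:resource|variable|module)\s+"([^"]+)"/gm;
    let match: RegExpExecArray | null;
    while ((match = blockHeader.exec(source)) !== null) {
      names.push(match[1] ?? "");
    }
    return names;
  },
};

// What src/extraction/languages/registry.ts might contain. In the real
// layout the "2 lines" would be one import plus one array entry.
const LANGUAGES: LanguageDef[] = [hcl];

// EXTRACTORS is derived from the registry instead of being a hand-edited list.
const EXTRACTORS = new Map<string, LanguageDef["extract"]>();
for (const def of LANGUAGES) {
  EXTRACTORS.set(def.id, def.extract);
}

// Resolve a file path to a language id via the registered extensions.
function languageForFile(path: string): string | undefined {
  const def = LANGUAGES.find((d) => d.extensions.some((ext) => path.endsWith(ext)));
  return def?.id;
}

// Look up the extractor for a file and run it; unknown files yield no symbols.
function extractSymbols(path: string, source: string): string[] {
  const id = languageForFile(path);
  const extract = id === undefined ? undefined : EXTRACTORS.get(id);
  return extract === undefined ? [] : extract(source);
}

console.log(extractSymbols("infra/main.tf", 'variable "region" {}\n')); // [ 'region' ]
```

With every derived structure built from LANGUAGES, a new language PR only adds its own file and its registry entry, which is why unrelated language PRs stop colliding.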
Phase 4: schema-touching feature PRs
Pre-requisite: #118 must be merged first (and ideally #117 + #119 too, for PRs that also touch MCP tools or sync hooks).
Rebase status
008-edges-unique.ts).
src/index-hooks/cochange.ts + src/db/migrations/010-co-changes.ts.
Fixes #N commits #113 (issue-history): already rebased, 1 hook + migration 005.
migrations.ts shape (version: 4). Targets main directly and is mergeable today, but once refactor: file-based migrations — eliminate silent version-collision bugs #118 lands, apply the Phase 4 cochange template to convert: move the body to src/db/migrations/<NNN>-drop-redundant-edge-indexes.ts, register it, drop the CURRENT_SCHEMA_VERSION bump (now auto-derived). Empirically validated via scripts/spikes/spike-edge-indexes.mjs: -22% DB size, 1.37× faster bulk insert, no query regression (EXPLAIN-confirmed covering scan via idx_edges_source_kind).
015-prune-co-changes-index.ts + 016-split-symbol-embeddings.ts, plus an EmbeddingCache + topKByCosineMatrix integrated into searchHybrid and findSimilar. Depends on feat(cochange): file-level co-change graph mined from git history #105 (co_changes table) and feat: LLM symbol summaries, semantic search, RAG Q&A, dir/role/dead-code/naming + agent-as-LLM bridge #111 (symbol_summaries.embedding column). Spike-validated (scripts/spikes/spike-embedding-split.mjs): 3.22× faster summary-only scans, 4.4× faster top-K cosine search. Independent reviewer pass caught and fixed: missing embeddingCache.invalidate() in clear(), duplicate idx_co_changes_b in schema.sql, sparse-array holes in EmbeddingCache.get on dim-mismatch.
018-node-coverage.ts, src/coverage/{lcov,index}.ts parser + ingestion, codegraph_coverage MCP tool (modes: symbol/ranked/stats), codegraph coverage <report> CLI. Lifts centrality > 0.5 AND coverage = 0 into a single agent-callable query: high-impact untested code in one row. Independent reviewer pass caught and fixed: Istanbul -1 excluded-line sentinel inflated totalLines, the SQL pct could be NULL where the TS type said number, monorepo-relative report paths failed to match.
Two follow-up fix commits added (April 28): (a) src/coverage/{index,lcov}.ts were never committed in the original PR because of an unanchored coverage/ .gitignore rule that silently swallowed them; fixed by anchoring the rule to /coverage/ (also filed as standalone PR fix(gitignore): anchor "coverage/" to repo root (unblocks src/coverage/, packages/*/coverage/) #127) and adding the missing module; (b) getCoverageStats(source) returned all sources in sources[] regardless of the filter; narrowed via WHERE clause + regression test.
019-code-health-findings.ts, src/biomarkers/ engine (5 biomarkers: Large Method, Complex Method, Nested Complexity, Complex Conditional, Brain Method), IndexHook that runs after every indexAll/sync, codegraph_biomarkers MCP tool. Languages covered: TS/JS, Python, Go, Java (others skipped gracefully). Independent reviewer pass caught and fixed: stale findings persisted forever when a function was refactored clean, countConditionalOperands inflated counts via inner lambdas, else_clause over-counted cyclomatic vs standard McCabe. Smoke-tested on codegraph itself: 397 findings on 198 symbols; top hotspot visitNode in tree-sitter.ts (Brain Method, cyclomatic 48, nesting 15, Code Health 3/10).
The template below remains the canonical "how to add a schema-touching feature on top of refactors" reference.
Rebase template (concrete: cochange = #105)
The cochange PR currently adds:
version: 4 entry to migrations[] in src/db/migrations.ts
CURRENT_SCHEMA_VERSION to 4
runDerivedSignals-style hook into CodeGraph.indexAll and sync
src/db/schema.sql
src/cochange/index.ts module
After Phase 2 lands, the rebase becomes:
Step 1. Move the migration body to its own file:
Create src/db/migrations/004-cochange-graph.ts:
Step 2. Register it in src/db/migrations/index.ts:
Step 3. Drop the CURRENT_SCHEMA_VERSION bump from your PR. It's auto-derived from the registry now.
Step 4. Move the runDerivedSignals integration to a hook file src/index-hooks/cochange.ts:
Step 5. Register the hook in src/index-hooks/registry.ts:
Step 6. Drop the runDerivedSignals private method from src/index.ts and the call sites in indexAll/sync. The hook runner already wires up your hook.
Same template applies to #112 centrality+churn, #113 issue-history, #114 config-refs, #115 sql-refs, #102 UNIQUE on edges (only the migration step), #108 perf db (if it adds indexes), #111 LLM features (4 migrations: pick 4 free prefixes).
For PRs that also add MCP tools (#112 hotspots, #114 config, #115 sql), follow the MCP tool registration addendum below.
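Steps 4 to 6 of the cochange template can be pictured with the sketch below. The IndexHook and IndexHookContext shapes, the in-memory coChanges map (standing in for the co_changes table), and runIndexHooks are all assumptions for illustration. Real hooks in #119 most likely run asynchronously against the database; the sketch is synchronous to stay short.

```typescript
// Hypothetical context; the real hook context in #119 will differ.
interface IndexHookContext {
  commits: string[][]; // files touched by each commit
  coChanges: Map<string, number>; // stand-in for the co_changes table
}

// Hypothetical hook interface.
interface IndexHook {
  name: string;
  run(ctx: IndexHookContext): void;
}

// What src/index-hooks/cochange.ts might contain: count how often each
// pair of files changes in the same commit.
const cochangeHook: IndexHook = {
  name: "cochange",
  run(ctx) {
    for (const files of ctx.commits) {
      const sorted = Array.from(new Set(files)).sort();
      for (let i = 0; i < sorted.length; i++) {
        for (let j = i + 1; j < sorted.length; j++) {
          const key = `${sorted[i]}|${sorted[j]}`;
          ctx.coChanges.set(key, (ctx.coChanges.get(key) ?? 0) + 1);
        }
      }
    }
  },
};

// What src/index-hooks/registry.ts might contain: one entry per hook.
const INDEX_HOOKS: IndexHook[] = [cochangeHook];

// The runner that indexAll and sync would call once each. A PR never
// edits this function; it only adds a registry entry.
function runIndexHooks(ctx: IndexHookContext): string[] {
  const ran: string[] = [];
  for (const hook of INDEX_HOOKS) {
    hook.run(ctx);
    ran.push(hook.name);
  }
  return ran;
}

const ctx: IndexHookContext = {
  commits: [
    ["a.ts", "b.ts"],
    ["b.ts", "a.ts", "c.ts"],
  ],
  coChanges: new Map(),
};
console.log(runIndexHooks(ctx)); // [ 'cochange' ]
console.log(ctx.coChanges.get("a.ts|b.ts")); // 2
```

The point of the design is that the private runDerivedSignals method and its call sites disappear from src/index.ts, so two feature PRs no longer edit the same lines of indexAll and sync.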
MCP tool registration addendum (Phase 3 + 4 PRs adding tools)
After #117 lands, adding a new MCP tool is:
Step 1. Create src/mcp/tools/<name>.ts:
Step 2. Add 2 lines in src/mcp/tools/registry.ts:
Step 3. Add the new key to the HandlerKey union in src/mcp/tools/types.ts:
Step 4. Add async handleHotspots(args: Record<string, unknown>): Promise<ToolResult> { ... } on ToolHandler in src/mcp/tools.ts. The case switch is gone; execute() finds your handler via the registry.
The ToolHandler implements ToolHandlerLike constraint catches the case where you add a key to HandlerKey but forget to implement the method (compile-time error).
Validation steps (apply after every PR lands)
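The implements ToolHandlerLike constraint from the MCP addendum is itself a check that needs no test run, and it can be sketched as follows. Only handleHotspots, HandlerKey, ToolHandlerLike, ToolHandler, ToolResult, and execute() are named in this guide; the ToolResult fields, the ToolDef shape, the tool names, and handleSearch are hypothetical.

```typescript
// Hypothetical result shape.
interface ToolResult {
  text: string;
}

// One key per handler method (src/mcp/tools/types.ts in the addendum).
type HandlerKey = "handleSearch" | "handleHotspots";

// Mapped type: every key in HandlerKey must exist as an async method.
type ToolHandlerLike = {
  [K in HandlerKey]: (args: Record<string, unknown>) => Promise<ToolResult>;
};

// Hypothetical registry entry: a tool name mapped to its handler key.
interface ToolDef {
  name: string;
  handler: HandlerKey;
}

const TOOLS: ToolDef[] = [
  { name: "codegraph_search", handler: "handleSearch" },
  { name: "codegraph_hotspots", handler: "handleHotspots" },
];

// Deleting handleHotspots below makes this class fail to compile,
// because ToolHandlerLike requires a method for every HandlerKey.
class ToolHandler implements ToolHandlerLike {
  async handleSearch(args: Record<string, unknown>): Promise<ToolResult> {
    return { text: `search(${String(args["query"] ?? "")})` };
  }

  async handleHotspots(args: Record<string, unknown>): Promise<ToolResult> {
    const limit = typeof args["limit"] === "number" ? args["limit"] : 10;
    return { text: `hotspots(limit=${limit})` };
  }

  // No case switch: the registry entry names the method to call.
  async execute(name: string, args: Record<string, unknown>): Promise<ToolResult> {
    const def = TOOLS.find((tool) => tool.name === name);
    if (def === undefined) {
      throw new Error(`Unknown tool: ${name}`);
    }
    return this[def.handler](args);
  }
}

new ToolHandler()
  .execute("codegraph_hotspots", { limit: 3 })
  .then((result) => console.log(result.text)); // hotspots(limit=3)
```

The remaining shared edit point is the class body itself, which matches the "1-way conflict (a method addition)" described for #117.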
Status (April 28, 2026)
Rebases done — no contributor action required:
Fixes #N commits #113, feat(graph): config_refs — env var read sites as queryable graph data #114, feat(graph): sql_refs — SQL string-literal call sites (pairs with #95) #115, perf(db): split embeddings + in-memory similarity cache + drop redundant co_changes index #123, feat(coverage): per-symbol code coverage from lcov reports + codegraph_coverage MCP tool #124, feat(biomarkers): static-analysis Code Health findings + codegraph_biomarkers MCP tool #125. Branches force-pushed; each is MERGEABLE on GitHub once its prerequisites land. Each PR has a pinned comment pointing back to this issue.
Still TODO (mine):
migrations.ts shape. Mergeable against today's main as-is, OR apply the Phase 4 cochange template after refactor: file-based migrations — eliminate silent version-collision bugs #118 lands (~5 min of work).
Still TODO (external contributors, friendly nudge needed):
src/resolution/frameworks/* and conflicts with fix: extraction/resolution accuracy (BOM, retry strip, framework regex) #101's accuracy fixes; needs reconciliation between the two approaches.
Still TODO (CONFLICTING per GitHub, need contributor rebase to current main):
src/vectors/, src/visualizer/, test_python_inheritance.js).
Validation (April 28, 2026): all 30 mergeable PRs stacked together as andreinknv:integration/all-prs, built fresh from colbymchenry:main via 18 sequential merges. 829/829 tests pass, 8 index-hooks fire, 22 MCP tools register, 18 migrations apply cleanly. indexAll + sync round-trip on 4 real codebases (codegraph itself, asf-platform, asf-dashboard, mcp-obsidian-extended).
Of the 35 PR branches: 33 are direct ancestors of integration/all-prs via merge commits. The 2 exceptions, #112 (centrality+churn) and #122 (drop-redundant-edge-indexes), have their content present via post-refactor port commits in the integration history, but their original branch tips aren't ancestors. This is intentional: #122 is documented above as remaining in monolithic shape targeting main directly, and #112 was rebased onto post-refactor patterns as a separate commit chain. Both PRs' code is exercised by the test suite via the integration.
Findings from rebuilding the integration from scratch (worth knowing for the merge order):
src/extraction/index.ts (or related utils). Sequential merging from upstream/main hits 4 conflicts within Phase 1 alone; they need to be merged in a specific order, or the conflicts hand-resolved. The 4 conflicts are tractable but the "any order" framing is misleading.
main; see also preview/all-four-refactors-merged, which validates this in isolation (417/417 tests).
e1b0595-shape (the user's Phase-4 base) can mostly be merged via merging any one of feat/coverage / feat/biomarkers / feat/mcp-server-instructions; that single merge brings in the rest of the Phase-4 stack as ancestors. This is the practical shortcut the maintainer can take.
(Earlier integration runs measured 757/757 tests + 13 tools + 7 hooks; the deltas are #124, #125, #126, #121, plus the 16 invariant tests added by #116's reviewer pass.)