perf(build): incremental rebuild optimizations — roles 255ms→9ms#622
perf(build): incremental rebuild optimizations — roles 255ms→9ms#622carlos-alm merged 11 commits intomainfrom
Conversation
…and finalize Roles classification (255ms → ~9ms): Add incremental path that only reclassifies nodes from changed files using indexed correlated subqueries instead of full table scans. Global medians computed from edge distribution for threshold consistency. Only reset roles for affected files, not all nodes. Structure loading: Replace N+1 per-file queries (definitions, import counts) with batch queries that load all data in 3 queries regardless of file count. Finalize: Skip advisory queries (orphaned/stale embeddings, unused exports) during incremental builds — informational warnings that don't affect correctness. classifyRoles: Accept optional median overrides so incremental path can supply global medians without querying all nodes. Also auto-fixes 3 pre-existing lint errors (useLiteralKeys, noNonNullAssertion) that existed on main and blocked all commits via husky pre-commit hook. Tests: Add roles parity test confirming incremental roles match full build. Add performance test with timing breakdown and threshold assertions.
|
Claude finished @carlos-alm's task —— View job PR Review: Incremental Rebuild OptimizationsFinal AssessmentPhase 1-5 Complete: All analysis phases passed with excellent results. Root Cause AnalysisProblem Being Solved: Incremental rebuilds took 802ms with roles classification at 255ms being the primary bottleneck. Self-builds triggering every file change needed sub-second performance to be practical. Backlog Compliance
Technical ExcellencePerformance Optimizations
Code Quality
Critical ConcernsNone identified. All phases passed stringent review criteria. Final Recommendation
|
The previous commit included 3 biome auto-fixes that conflict with TypeScript strictness: optional chaining produces undefined (not null), and dot-access on index signatures is disallowed by TS4111. These are pre-existing lint issues on main — revert to the original code. Impact: 4 functions changed, 0 affected
Greptile SummaryThis PR delivers two independent improvements: incremental rebuild performance (roles classification 255ms→9ms via indexed correlated subqueries and edge-neighbour expansion, N+1 structure queries replaced with 3 batch queries, advisory queries skipped on incremental) and a new Key changes:
Confidence Score: 5/5Safe to merge; all prior review concerns are resolved and the new code is well-tested Every issue raised in previous rounds (stale roles, median duplication, partial RoleSummary, flaky CI thresholds, non-structural parity test) has been fixed. The incremental logic is correct — global medians are computed from the full edge distribution, affected sets expand to edge-neighbours, and the DB update is transactional. The new No files require special attention Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[classifyNodeRoles] -->|changedFiles provided| B[classifyNodeRolesIncremental]
A -->|no changedFiles| C[classifyNodeRolesFull]
B --> D[Expand to edge-neighbours\nSQL: callers + callees in other files]
D --> E[Compute global medians\nfrom edge distribution]
E --> F[Fetch affected nodes\nindexed correlated subqueries]
F --> G[classifyRoles with medianOverrides]
G --> H[Transaction: reset roles for\naffected files only, then update]
C --> I[Full table scan\nall nodes + fan-in/fan-out]
I --> J[classifyRoles — derives\nmedians from local node set]
J --> K[Transaction: SET role=NULL\nfor all nodes, then update]
style B fill:#d4edda,stroke:#28a745
style C fill:#fff3cd,stroke:#ffc107
Reviews (3): Last reviewed commit: "fix(roles): document incremental RoleSum..." | Re-trigger Greptile |
| expect(p.rolesMs).toBeLessThan(50); | ||
| expect(p.structureMs).toBeLessThan(50); | ||
| expect(p.finalizeMs).toBeLessThan(50); |
There was a problem hiding this comment.
Performance assertions may cause flaky CI failures
Wall-clock timing assertions (< 50ms) are environment-sensitive and can fail non-deterministically on slow CI machines, under heavy load, or during JIT warm-up. The fixture is tiny (barrel-project), so it's unlikely to trigger the threshold in normal runs — but it's easy to exceed 50ms during I/O spikes or garbage collection pauses, making these assertions a source of intermittent failures.
Consider skipping the assertion on CI environments, increasing the threshold to something like 200ms, or using a relative comparison rather than an absolute wall-clock budget:
// Either:
if (!process.env.CI) {
expect(p.rolesMs).toBeLessThan(50);
expect(p.structureMs).toBeLessThan(50);
expect(p.finalizeMs).toBeLessThan(50);
}
// Or increase to a generous CI-safe budget:
expect(p.rolesMs).toBeLessThan(200);
expect(p.structureMs).toBeLessThan(200);
expect(p.finalizeMs).toBeLessThan(200);There was a problem hiding this comment.
Fixed — increased thresholds from 50ms to 200ms with a comment explaining the rationale. Local benchmarks show ~9ms for roles, so 200ms provides a generous CI-safe buffer against slow runners, GC pauses, and I/O spikes.
| }[]; | ||
|
|
||
| if (rows.length === 0) return emptySummary; | ||
|
|
||
| // 3. Get exported status for affected nodes only (scoped to changed files) | ||
| const exportedIds = new Set( | ||
| ( | ||
| db | ||
| .prepare( | ||
| `SELECT DISTINCT e.target_id | ||
| FROM edges e | ||
| JOIN nodes caller ON e.source_id = caller.id | ||
| JOIN nodes target ON e.target_id = target.id | ||
| WHERE e.kind = 'calls' AND caller.file != target.file | ||
| AND target.file IN (${placeholders})`, | ||
| ) | ||
| .all(...changedFiles) as { target_id: number }[] | ||
| ).map((r) => r.target_id), | ||
| ); | ||
|
|
||
| // 4. Production fan-in for affected nodes only | ||
| const prodFanInMap = new Map<number, number>(); | ||
| const prodRows = db | ||
| .prepare( | ||
| `SELECT e.target_id, COUNT(*) AS cnt | ||
| FROM edges e | ||
| JOIN nodes caller ON e.source_id = caller.id | ||
| JOIN nodes target ON e.target_id = target.id | ||
| WHERE e.kind = 'calls' | ||
| AND target.file IN (${placeholders}) | ||
| ${testFilterSQL('caller.file')} | ||
| GROUP BY e.target_id`, | ||
| ) | ||
| .all(...changedFiles) as { target_id: number; cnt: number }[]; | ||
| for (const r of prodRows) { |
There was a problem hiding this comment.
Roles for unchanged callers can become stale after structural changes
The incremental path only reclassifies nodes belonging to changedFiles. However, a node's role also depends on its fan_in (how many other nodes call it). When a changed file removes a call to a node in an unchanged file, that unchanged node's fan_in decreases — and its role may need to flip (e.g. core → leaf). Because only the changed files' nodes are reset and reclassified, callers in unchanged files retain their previous roles until the next full build.
The existing parity test covers only a comment-only change (// touched), which produces no structural edge changes and therefore cannot expose this case. A test that removes a cross-file call would catch the divergence.
This is a known trade-off for incremental speed, but it's worth documenting explicitly (e.g., a code comment noting the limitation) and ideally adding a regression test for a structural change (add/remove a cross-file call) to confirm the divergence is either acceptable or handled.
There was a problem hiding this comment.
Fixed — rather than documenting this as a known trade-off, I fixed the bug. The incremental path now expands the affected file set to include edge neighbours (files containing nodes that are callers/callees of changed-file nodes). This ensures that removing a call from file A to a node in file B causes B's roles to be recalculated. Added a structural-change parity test that removes a cross-file call and verifies incremental roles match a full rebuild — this test now passes.
src/features/structure.ts
Outdated
| .all(...changedFiles) as { target_id: number }[] | ||
| ).map((r) => r.target_id), | ||
| ); | ||
|
|
||
| // 4. Production fan-in for affected nodes only | ||
| const prodFanInMap = new Map<number, number>(); | ||
| const prodRows = db | ||
| .prepare( | ||
| `SELECT e.target_id, COUNT(*) AS cnt | ||
| FROM edges e | ||
| JOIN nodes caller ON e.source_id = caller.id | ||
| JOIN nodes target ON e.target_id = target.id | ||
| WHERE e.kind = 'calls' | ||
| AND target.file IN (${placeholders}) | ||
| ${testFilterSQL('caller.file')} | ||
| GROUP BY e.target_id`, | ||
| ) | ||
| .all(...changedFiles) as { target_id: number; cnt: number }[]; | ||
| for (const r of prodRows) { |
There was a problem hiding this comment.
median function is re-defined inside the incremental path
A local median helper is defined inside classifyNodeRolesIncremental, duplicating the module-level median function already defined in src/graph/classifiers/roles.ts. Both implementations are identical.
Consider importing or moving the shared helper to avoid the duplication:
// Option A: import it from roles.ts
import { median } from '../graph/classifiers/roles.js';
// Option B: hoist it to module scope in structure.ts and share it with classifyNodeRolesFullThere was a problem hiding this comment.
Fixed — exported the existing median function from src/graph/classifiers/roles.ts and imported it in structure.ts. Removed the duplicate local definition from classifyNodeRolesIncremental.
| it('preserves node roles after incremental rebuild', () => { | ||
| function readRoles(dbPath: string) { | ||
| const db = new Database(dbPath, { readonly: true }); | ||
| const roles = db | ||
| .prepare( | ||
| `SELECT name, kind, file, role FROM nodes | ||
| WHERE kind NOT IN ('file', 'directory') AND role IS NOT NULL | ||
| ORDER BY name, kind, file`, | ||
| ) | ||
| .all(); | ||
| db.close(); | ||
| return roles; | ||
| } | ||
| const fullRoles = readRoles(path.join(fullDir, '.codegraph', 'graph.db')); | ||
| const incrRoles = readRoles(path.join(incrDir, '.codegraph', 'graph.db')); | ||
| expect(incrRoles.length).toBeGreaterThan(0); | ||
| expect(incrRoles).toEqual(fullRoles); | ||
| }); |
There was a problem hiding this comment.
Parity test only validates trivially non-structural changes
The roles parity test (preserves node roles after incremental rebuild) touches app.js with a comment (// touched), which appends no new symbols and creates no new edges. This means the test verifies parity only in the case where no edges change, which is the easiest scenario for the incremental path to get right.
A more meaningful parity test would exercise a structural change — for example, adding or removing a function call between files — to confirm that the incremental roles stay consistent with a full rebuild even when fan_in/fan_out values shift. Without this, the parity guarantee is weaker than it appears.
There was a problem hiding this comment.
Fixed — added a new Incremental build parity: structural change (add/remove call) test suite that rewrites app.js to remove the multiply() call, then verifies nodes, edges, and roles all match between incremental and full rebuild. This exercises the edge-removal scenario and caught the stale-roles bug that is now fixed via edge-neighbour expansion.
Address Greptile review feedback: - Fix stale roles for unchanged callers by expanding the affected file set to include edge neighbours (callers/callees of changed files) - Remove duplicate median() helper from classifyNodeRolesIncremental, export and reuse the one from graph/classifiers/roles.ts - Increase performance assertion thresholds from 50ms to 200ms to prevent flaky CI failures on slow runners - Add structural-change parity test that removes a cross-file call and verifies incremental roles match a full rebuild Impact: 3 functions changed, 4 affected
| // 6. Build summary (only for affected nodes) and update only those nodes | ||
| const summary: RoleSummary = { ...emptySummary }; | ||
| const idsByRole = new Map<string, number[]>(); | ||
| for (const row of rows) { | ||
| const role = roleMap.get(String(row.id)) || 'leaf'; | ||
| if (role.startsWith('dead')) summary.dead++; | ||
| summary[role] = (summary[role] || 0) + 1; | ||
| let ids = idsByRole.get(role); | ||
| if (!ids) { | ||
| ids = []; | ||
| idsByRole.set(role, ids); | ||
| } | ||
| ids.push(row.id); | ||
| } |
There was a problem hiding this comment.
Incremental
RoleSummary under-counts when logged
The summary returned by the incremental path is built solely from rows (the affected nodes), so every role counter starts at 0 and only reflects the reclassified subset. The full path returns a graph-wide summary from all nodes, so the two return types are semantically different despite sharing the same RoleSummary shape.
Currently the return value is only used for a debug log line:
debug(
`Roles${changedFileList ? ` (incremental, ${changedFileList.length} files)` : ''}: ${Object.entries(roleSummary)...}`,
);This is harmless today, and the log prefix (incremental, N files) makes the limitation visible. The risk is that a future caller treats the returned counts as global totals and makes incorrect decisions (e.g., "0 core nodes" triggering a warning).
Consider either:
- Documenting the semantics explicitly on
classifyNodeRoles's return type (JSDoc noting that incremental returns are scoped to affected nodes), or - Returning
nullfrom the incremental path to signal "partial data" to callers.
This is not a bug today but a potential footgun as the codebase evolves.
There was a problem hiding this comment.
Fixed — added a JSDoc comment to classifyNodeRoles explicitly documenting that when changedFiles is provided, the returned RoleSummary reflects only the affected subset, not the entire graph. Callers needing graph-wide totals should omit changedFiles or query the DB directly.
The path command previously only supported symbol-to-symbol queries. Add file-level BFS that traverses import edges between files, enabling queries like `codegraph path cli.ts db/index.ts --file` for sync-phase analysis and module dependency exploration. - filePathData() in dependencies.ts: file-level BFS over import edges - CLI: -f/--file flag, defaults edge kinds to imports,imports-type - MCP: file_mode parameter on path tool - 9 integration tests covering 1-hop, multi-hop, reverse, noTests, etc. Impact: 4 functions changed, 5 affected
Impact: 18 functions changed, 31 affected
…codegraph into perf/incremental-rebuild
Summary
classifyNodeRolesthat only reclassifies nodes from changed files using indexed correlated subqueries instead of full table scans. Global medians computed from edge distribution for consistent thresholds. Only resets roles for affected files, not all nodes.codegraph path --filesupport: Add file-to-file shortest path queries via BFS over import edges. NewfilePathData()function, CLI-f/--fileflag, MCPfile_modeparameter. 9 integration tests.Measured results (codegraph self-build, native engine, 10.9k nodes, 20.9k edges)
Total incremental 1-file rebuild: ~340ms (down from ~802ms, 58% faster).
Remaining bottleneck is
closeDbWAL checkpoint (~170ms) and file parsing (~89ms for 12 files including 11 reverse-deps), which are outside the scope of pipeline-level optimizations.Test plan
tsc --noEmit)