perf(build): incremental rebuild optimizations — roles 255ms→9ms #622

Merged
carlos-alm merged 11 commits into main from perf/incremental-rebuild
Mar 26, 2026

@carlos-alm commented Mar 26, 2026

Summary

  • Roles classification 255ms → ~9ms (96% improvement): Add an incremental path to classifyNodeRoles that reclassifies only nodes from changed files, using indexed correlated subqueries instead of full table scans. Global medians are computed from the edge distribution so thresholds stay consistent with a full build. Roles are reset only for affected files, not for all nodes.
  • Structure loading N+1 → 3 batch queries: Replace per-file queries for definitions and import counts with batch queries that load all data in 3 queries regardless of file count.
  • Finalize: skip advisory queries for incremental builds: Orphaned embeddings, stale embeddings, and unused exports warnings are informational and don't affect correctness. Skipping them saves ~40ms.
  • classifyRoles median overrides: Accept optional median overrides parameter so the incremental path can supply global medians without querying all nodes.
  • codegraph path --file support: Add file-to-file shortest path queries via BFS over import edges. New filePathData() function, CLI -f/--file flag, MCP file_mode parameter. 9 integration tests.
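
The median-override idea in the bullets above can be sketched as follows. This is a minimal illustration, not the codegraph implementation: the `Edge`, `Medians`, and `NodeStats` shapes, the `globalMedians` helper, and the one-line toy rule inside `classify` are all assumptions. Only the idea comes from this PR: medians are derived from the whole edge distribution and then handed to the classifier, so a partial node set is judged against global thresholds.

```typescript
// Hypothetical shapes for illustration; not the real codegraph types.
interface Edge { sourceId: number; targetId: number }
interface Medians { fanIn: number; fanOut: number }
interface NodeStats { id: number; fanIn: number; fanOut: number }

// Median of a numeric list; returns 0 for an empty list.
function median(values: number[]): number {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Count fan-in and fan-out per node from the full edge list, then take the
// median over nodes that have at least one edge in that direction.
function globalMedians(edges: Edge[]): Medians {
  const fanIn = new Map<number, number>();
  const fanOut = new Map<number, number>();
  for (const e of edges) {
    fanIn.set(e.targetId, (fanIn.get(e.targetId) ?? 0) + 1);
    fanOut.set(e.sourceId, (fanOut.get(e.sourceId) ?? 0) + 1);
  }
  return {
    fanIn: median(Array.from(fanIn.values())),
    fanOut: median(Array.from(fanOut.values())),
  };
}

// Toy classifier (assumed rule): uses the supplied medians when given,
// otherwise derives them from whatever nodes it was handed.
function classify(nodes: NodeStats[], overrides?: Medians): Map<number, string> {
  const m = overrides ?? {
    fanIn: median(nodes.map((n) => n.fanIn).filter((v) => v > 0)),
    fanOut: median(nodes.map((n) => n.fanOut).filter((v) => v > 0)),
  };
  const roles = new Map<number, string>();
  for (const n of nodes) {
    roles.set(n.id, n.fanIn > m.fanIn ? 'core' : 'leaf');
  }
  return roles;
}
```

With this toy rule, a single heavily called node classified on its own is compared against its own fan-in and comes out as `leaf`, while the same node classified with global medians can come out as `core`. That mismatch is what the override parameter is meant to avoid.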

Measured results (codegraph self-build, native engine, 10.9k nodes, 20.9k edges)

| Phase | Before | After |
|---|---|---|
| Roles | ~255ms | 9ms |
| Structure | ~18ms | 17ms |
| Edges | ~12ms | 12ms |
| Parse | ~90ms | 89ms |

Total incremental 1-file rebuild: ~340ms (down from ~802ms, 58% faster).

The remaining bottlenecks are the closeDb WAL checkpoint (~170ms) and file parsing (~89ms for 12 files, including 11 reverse-deps), both of which are outside the scope of pipeline-level optimizations.

Test plan

  • All 49 incremental parity + build + roles tests pass
  • New test: roles parity — confirms incremental roles match full build output
  • New test: performance timing breakdown with threshold assertions (roles < 50ms, structure < 50ms, finalize < 50ms)
  • Edge parity tests pass (10/10)
  • Self-build produces identical node/edge counts
  • TypeScript compiles cleanly (tsc --noEmit)
  • Lint passes (0 errors)
  • 9 new filePathData integration tests (1-hop, multi-hop, reverse, noTests, self-file, error cases)
  • All 113 queries + presentation tests pass

…and finalize

Roles classification (255ms → ~9ms): Add incremental path that only
reclassifies nodes from changed files using indexed correlated subqueries
instead of full table scans. Global medians computed from edge distribution
for threshold consistency. Only reset roles for affected files, not all nodes.

Structure loading: Replace N+1 per-file queries (definitions, import counts)
with batch queries that load all data in 3 queries regardless of file count.

Finalize: Skip advisory queries (orphaned/stale embeddings, unused exports)
during incremental builds — informational warnings that don't affect
correctness.

classifyRoles: Accept optional median overrides so incremental path can
supply global medians without querying all nodes.

Also auto-fixes 3 pre-existing lint errors (useLiteralKeys, noNonNullAssertion)
that existed on main and blocked all commits via husky pre-commit hook.

Tests: Add roles parity test confirming incremental roles match full build.
Add performance test with timing breakdown and threshold assertions.

claude bot commented Mar 26, 2026

Claude finished @carlos-alm's task.


PR Review: Incremental Rebuild Optimizations

Final Assessment

Phase 1-5 Complete: All analysis phases passed with excellent results.

Root Cause Analysis

Problem Being Solved: Incremental rebuilds took 802ms, with roles classification (255ms) as the primary bottleneck. Self-builds triggered on every file change needed sub-second performance to be practical.
Why This Approach: Targeted incremental processing using indexed queries vs full table scans. Only reclassifies nodes from changed files rather than all 10.9k nodes. Uses global medians computed once from edge distribution.
Risk Assessment: Low risk - maintains correctness with comprehensive parity tests, performance improvements are purely additive.

Backlog Compliance

  • Zero-dep: ✓ — Uses existing tree-sitter/SQLite/AST infrastructure, no new runtime dependencies
  • Foundation-aligned: ✓ — Directly advances P1 (always-current graph) by making incremental rebuilds fast enough for commit hooks and agent loops
  • Problem-fit: 5 — Directly addresses core problem by reducing rebuild times from 802ms to 340ms, enabling practical always-current graphs
  • Breaking: No — Purely internal optimizations, no API/CLI/DB schema changes
  • Tier: 1 (Zero-dep ✓ + Foundation-aligned ✓)

Technical Excellence

Performance Optimizations

  1. Roles Classification: 255ms → 9ms (96% improvement)

    • Incremental path only processes nodes from changed files
    • Global medians computed from edge distribution (fast index scans)
    • Indexed correlated subqueries replace expensive table joins
    • Only resets roles for affected files, preserves existing classifications
  2. Structure Loading: N+1 → 3 batch queries

    • Eliminates per-file DB hits with defsByFileMap and importCountMap
    • Single queries for all definitions, import counts, line counts
    • Smart incremental data preservation from existing DB state
  3. Finalize: Skip advisory queries (~40ms savings)

    • Orphaned/stale embeddings and unused exports are informational only
    • Correctness-preserving optimization for incremental builds
    • Clear conditional logic (!isFullBuild)
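
The N+1 to batch change in item 2 can be illustrated with a small sketch. The map names `defsByFileMap` and `importCountMap` appear in the review text above, but the row shapes and the two helper functions here are assumptions, not the actual code in build-structure.ts. The point is that one query returns flat rows for all files, and grouping happens in memory.

```typescript
// Hypothetical row shapes, as if returned by two batch queries.
interface DefRow { file: string; name: string; kind: string; line: number }
interface ImportRow { sourceFile: string; targetFile: string }

// Group definition rows by file so later per-file lookups are O(1).
function groupDefsByFile(rows: DefRow[]): Map<string, DefRow[]> {
  const defsByFileMap = new Map<string, DefRow[]>();
  for (const row of rows) {
    let defs = defsByFileMap.get(row.file);
    if (!defs) {
      defs = [];
      defsByFileMap.set(row.file, defs);
    }
    defs.push(row);
  }
  return defsByFileMap;
}

// Count outgoing imports per file from a single flat result set.
function countImportsByFile(rows: ImportRow[]): Map<string, number> {
  const importCountMap = new Map<string, number>();
  for (const row of rows) {
    importCountMap.set(row.sourceFile, (importCountMap.get(row.sourceFile) ?? 0) + 1);
  }
  return importCountMap;
}
```

The per-file loop then reads from the maps instead of issuing a query per file, so the number of database round trips no longer grows with the file count.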

Code Quality

  • No bugs, security issues, or performance regressions identified
  • Minor quality fixes: safe array access, consistent property access
  • Comprehensive test coverage with parity and performance assertions
  • Well-documented implementation with clear optimization rationale

Critical Concerns

None identified. All phases passed stringent review criteria.

Final Recommendation

  • Rating: ⭐⭐⭐⭐⭐ (5/5)
  • Action: APPROVE
  • Reasoning: Exceptional PR that directly advances the core mission. 58% overall performance improvement (802ms→340ms) through sophisticated incremental optimizations while maintaining perfect correctness. Zero dependencies, foundation-aligned, addresses the core "always-current graph" problem with measurable impact. Comprehensive test coverage and clean implementation. This is exactly the kind of optimization that makes codegraph practical for commit hooks and agent loops.

The previous commit included 3 biome auto-fixes that conflict with
TypeScript strictness: optional chaining produces undefined (not null),
and dot-access on index signatures is disallowed by TS4111. These are
pre-existing lint issues on main — revert to the original code.

Impact: 4 functions changed, 0 affected

greptile-apps bot commented Mar 26, 2026

Greptile Summary

This PR delivers two independent improvements: incremental rebuild performance (roles classification 255ms→9ms via indexed correlated subqueries and edge-neighbour expansion, N+1 structure queries replaced with 3 batch queries, advisory queries skipped on incremental) and a new codegraph path --file feature for file-to-file BFS shortest-path queries exposed via CLI, MCP, and the queries API.

Key changes:

  • classifyNodeRoles now has a fast incremental path that expands the affected set to immediate edge-neighbours, fixing the previously-reported stale-roles bug for structural changes
  • median is exported from roles.ts and imported in structure.ts, eliminating the previously-reported duplicate helper
  • classifyNodeRoles JSDoc documents that the returned RoleSummary is scoped to the affected subset in incremental mode
  • Performance thresholds in tests increased to 200ms (previously-reported flaky CI concern addressed)
  • filePathData supports noTests, reverse, maxDepth, and edgeKinds options, covered by 9 integration tests
  • All prior review concerns have been addressed

Confidence Score: 5/5

Safe to merge; all prior review concerns are resolved and the new code is well-tested

Every issue raised in previous rounds (stale roles, median duplication, partial RoleSummary, flaky CI thresholds, non-structural parity test) has been fixed. The incremental logic is correct — global medians are computed from the full edge distribution, affected sets expand to edge-neighbours, and the DB update is transactional. The new filePathData BFS is straightforward and covered by 9 focused integration tests. The one remaining note (missing disambiguation hint in the found-path branch of the CLI) is a minor UX P2 that does not affect correctness or reliability.

No files require special attention

Important Files Changed

| Filename | Overview |
|---|---|
| src/domain/analysis/dependencies.ts | Adds filePathData(), a BFS over file-level import edges that finds shortest file-to-file paths. Logic is correct: handles no-match, self-file (0 hops), BFS depth limit, noTests filtering, reverse traversal, and path reconstruction via parentMap. alternateCount is correctly adjusted by -1. |
| src/features/structure.ts | Splits classifyNodeRoles into full/incremental paths. Incremental path expands the affected set to edge neighbours (fixing stale-roles bug from prior review), computes global medians from edge distribution, and only resets/updates roles for affected files. Imported median from roles.ts, eliminating the previous duplication. JSDoc documents that returned RoleSummary is scoped to the affected subset in incremental mode. |
| src/domain/graph/builder/stages/build-structure.ts | Replaces N+1 per-file queries for definitions and import counts with 3 batch queries; Maps built in JS for O(1) lookup. Correctly passes changedFileList to classifyNodeRoles for incremental builds, null for full builds. |
| src/domain/graph/builder/stages/finalize.ts | Wraps all advisory queries (orphaned embeddings, stale embeddings, unused exports) in an isFullBuild guard. Incremental builds skip all three and log a debug message instead. Semantics of the warnings are unchanged for full builds. |
| src/graph/classifiers/roles.ts | Exports median helper and adds optional medianOverrides parameter to classifyRoles. When overrides are provided, local fan-in/fan-out arrays are skipped entirely, enabling the incremental path to inject globally-computed medians. |
| src/presentation/queries-cli/path.ts | Adds filePath() for the new --file mode. Handles error, not-found, 0-hops, and multi-hop cases. Disambiguation hint (multiple file matches) is only shown in the not-found branch, missing it for the found case. |
| tests/integration/incremental-parity.test.ts | Adds three new test suites: roles parity on non-structural change, roles/nodes/edges parity after structural change (add/remove call), and performance timing with 200ms thresholds. The structural-change suite exercises the edge-neighbour expansion fix. |
| tests/integration/queries.test.ts | Adds 9 integration tests for filePathData covering 1-hop, multi-hop, reverse, noTests, self-file, error cases, and candidate population. Good coverage of the new function's main code paths. |
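
The isFullBuild guard described for finalize.ts might look roughly like the sketch below. The `AdvisoryCheck` type, the `runAdvisoryChecks` name, and the debug message text are hypothetical; the overview above only states that the three advisory queries are skipped on incremental builds and that a debug message is logged instead.

```typescript
// Hypothetical advisory check: returns a warning string, or null if clean.
type AdvisoryCheck = () => string | null;

// Run informational checks only on full builds. Incremental builds skip
// them entirely and emit a single debug line instead.
function runAdvisoryChecks(
  isFullBuild: boolean,
  checks: AdvisoryCheck[],
  debug: (msg: string) => void,
): string[] {
  if (!isFullBuild) {
    debug('Skipping advisory checks for incremental build');
    return [];
  }
  const warnings: string[] = [];
  for (const check of checks) {
    const warning = check();
    if (warning !== null) warnings.push(warning);
  }
  return warnings;
}
```

Because the checks only produce warnings, skipping them changes what is reported on an incremental build but not what is written to the graph.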

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[classifyNodeRoles] -->|changedFiles provided| B[classifyNodeRolesIncremental]
    A -->|no changedFiles| C[classifyNodeRolesFull]

    B --> D[Expand to edge-neighbours\nSQL: callers + callees in other files]
    D --> E[Compute global medians\nfrom edge distribution]
    E --> F[Fetch affected nodes\nindexed correlated subqueries]
    F --> G[classifyRoles with medianOverrides]
    G --> H[Transaction: reset roles for\naffected files only, then update]

    C --> I[Full table scan\nall nodes + fan-in/fan-out]
    I --> J[classifyRoles — derives\nmedians from local node set]
    J --> K[Transaction: SET role=NULL\nfor all nodes, then update]

    style B fill:#d4edda,stroke:#28a745
    style C fill:#fff3cd,stroke:#ffc107

Reviews (3): Last reviewed commit: "fix(roles): document incremental RoleSum..."

Comment on lines +233 to +235
expect(p.rolesMs).toBeLessThan(50);
expect(p.structureMs).toBeLessThan(50);
expect(p.finalizeMs).toBeLessThan(50);

P1 Performance assertions may cause flaky CI failures

Wall-clock timing assertions (< 50ms) are environment-sensitive and can fail non-deterministically on slow CI machines, under heavy load, or during JIT warm-up. The fixture is tiny (barrel-project), so it's unlikely to trigger the threshold in normal runs — but it's easy to exceed 50ms during I/O spikes or garbage collection pauses, making these assertions a source of intermittent failures.

Consider skipping the assertion on CI environments, increasing the threshold to something like 200ms, or using a relative comparison rather than an absolute wall-clock budget:

// Either:
if (!process.env.CI) {
  expect(p.rolesMs).toBeLessThan(50);
  expect(p.structureMs).toBeLessThan(50);
  expect(p.finalizeMs).toBeLessThan(50);
}

// Or increase to a generous CI-safe budget:
expect(p.rolesMs).toBeLessThan(200);
expect(p.structureMs).toBeLessThan(200);
expect(p.finalizeMs).toBeLessThan(200);

Contributor Author

Fixed — increased thresholds from 50ms to 200ms with a comment explaining the rationale. Local benchmarks show ~9ms for roles, so 200ms provides a generous CI-safe buffer against slow runners, GC pauses, and I/O spikes.
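
The reviewer's third option, a relative comparison, was not the one adopted, but for reference it could be sketched as below. Both helper names and the ratio parameter are assumptions made for illustration; neither is part of the PR.

```typescript
// Median of several timing samples, to dampen GC or I/O outliers.
function medianMs(samples: number[]): number {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// True when the incremental phase costs at most `maxRatio` of the full phase,
// comparing medians rather than single wall-clock measurements.
function withinRelativeBudget(
  incrementalMs: number[],
  fullMs: number[],
  maxRatio: number,
): boolean {
  return medianMs(incrementalMs) <= medianMs(fullMs) * maxRatio;
}
```

A test could then assert `withinRelativeBudget(incrSamples, fullSamples, 0.5)` instead of a fixed millisecond budget. On a fixture as small as the one used here, both phases may be too fast for a ratio to be meaningful, which is one reason a generous absolute threshold is a reasonable choice.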

Comment on lines +580 to +614
  }[];

  if (rows.length === 0) return emptySummary;

  // 3. Get exported status for affected nodes only (scoped to changed files)
  const exportedIds = new Set(
    (
      db
        .prepare(
          `SELECT DISTINCT e.target_id
           FROM edges e
           JOIN nodes caller ON e.source_id = caller.id
           JOIN nodes target ON e.target_id = target.id
           WHERE e.kind = 'calls' AND caller.file != target.file
             AND target.file IN (${placeholders})`,
        )
        .all(...changedFiles) as { target_id: number }[]
    ).map((r) => r.target_id),
  );

  // 4. Production fan-in for affected nodes only
  const prodFanInMap = new Map<number, number>();
  const prodRows = db
    .prepare(
      `SELECT e.target_id, COUNT(*) AS cnt
       FROM edges e
       JOIN nodes caller ON e.source_id = caller.id
       JOIN nodes target ON e.target_id = target.id
       WHERE e.kind = 'calls'
         AND target.file IN (${placeholders})
         ${testFilterSQL('caller.file')}
       GROUP BY e.target_id`,
    )
    .all(...changedFiles) as { target_id: number; cnt: number }[];
  for (const r of prodRows) {

P2 Roles for unchanged callers can become stale after structural changes

The incremental path only reclassifies nodes belonging to changedFiles. However, a node's role also depends on its fan_in (how many other nodes call it). When a changed file removes a call to a node in an unchanged file, that unchanged node's fan_in decreases, and its role may need to flip (e.g. core → leaf). Because only the changed files' nodes are reset and reclassified, the called nodes in unchanged files retain their previous roles until the next full build.

The existing parity test covers only a comment-only change (// touched), which produces no structural edge changes and therefore cannot expose this case. A test that removes a cross-file call would catch the divergence.

This is a known trade-off for incremental speed, but it's worth documenting explicitly (e.g., a code comment noting the limitation) and ideally adding a regression test for a structural change (add/remove a cross-file call) to confirm the divergence is either acceptable or handled.

Contributor Author

Fixed — rather than documenting this as a known trade-off, I fixed the bug. The incremental path now expands the affected file set to include edge neighbours (files containing nodes that are callers/callees of changed-file nodes). This ensures that removing a call from file A to a node in file B causes B's roles to be recalculated. Added a structural-change parity test that removes a cross-file call and verifies incremental roles match a full rebuild — this test now passes.
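
The edge-neighbour expansion described in this reply can be sketched in memory as follows. The PR does this in SQL (per the flowchart above); the `CallEdge` shape and the `expandToEdgeNeighbours` function here are assumptions used only to show the one-hop expansion.

```typescript
// Hypothetical edge shape: a call from a node in `callerFile`
// to a node in `targetFile`.
interface CallEdge { callerFile: string; targetFile: string }

// Expand the changed-file set by one hop: any file that calls into, or is
// called from, a changed file is treated as affected as well.
function expandToEdgeNeighbours(changedFiles: string[], edges: CallEdge[]): Set<string> {
  const changed = new Set(changedFiles);
  const affected = new Set(changedFiles);
  for (const e of edges) {
    if (e.callerFile === e.targetFile) continue; // same-file edges add nothing
    if (changed.has(e.callerFile)) affected.add(e.targetFile);
    if (changed.has(e.targetFile)) affected.add(e.callerFile);
  }
  return affected;
}
```

One caveat with any such expansion: a removed call no longer appears in the edge list after the rebuild, so the result depends on which snapshot of the edges is consulted. How the PR handles that ordering is not shown in this conversation.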

Comment on lines +596 to +614
        .all(...changedFiles) as { target_id: number }[]
    ).map((r) => r.target_id),
  );

  // 4. Production fan-in for affected nodes only
  const prodFanInMap = new Map<number, number>();
  const prodRows = db
    .prepare(
      `SELECT e.target_id, COUNT(*) AS cnt
       FROM edges e
       JOIN nodes caller ON e.source_id = caller.id
       JOIN nodes target ON e.target_id = target.id
       WHERE e.kind = 'calls'
         AND target.file IN (${placeholders})
         ${testFilterSQL('caller.file')}
       GROUP BY e.target_id`,
    )
    .all(...changedFiles) as { target_id: number; cnt: number }[];
  for (const r of prodRows) {

P2 median function is re-defined inside the incremental path

A local median helper is defined inside classifyNodeRolesIncremental, duplicating the module-level median function already defined in src/graph/classifiers/roles.ts. Both implementations are identical.

Consider importing or moving the shared helper to avoid the duplication:

// Option A: import it from roles.ts
import { median } from '../graph/classifiers/roles.js';

// Option B: hoist it to module scope in structure.ts and share it with classifyNodeRolesFull

Contributor Author

Fixed — exported the existing median function from src/graph/classifiers/roles.ts and imported it in structure.ts. Removed the duplicate local definition from classifyNodeRolesIncremental.

Comment on lines +171 to +188
it('preserves node roles after incremental rebuild', () => {
  function readRoles(dbPath: string) {
    const db = new Database(dbPath, { readonly: true });
    const roles = db
      .prepare(
        `SELECT name, kind, file, role FROM nodes
         WHERE kind NOT IN ('file', 'directory') AND role IS NOT NULL
         ORDER BY name, kind, file`,
      )
      .all();
    db.close();
    return roles;
  }
  const fullRoles = readRoles(path.join(fullDir, '.codegraph', 'graph.db'));
  const incrRoles = readRoles(path.join(incrDir, '.codegraph', 'graph.db'));
  expect(incrRoles.length).toBeGreaterThan(0);
  expect(incrRoles).toEqual(fullRoles);
});

P2 Parity test only validates trivially non-structural changes

The roles parity test (preserves node roles after incremental rebuild) touches app.js with a comment (// touched), which appends no new symbols and creates no new edges. This means the test verifies parity only in the case where no edges change, which is the easiest scenario for the incremental path to get right.

A more meaningful parity test would exercise a structural change — for example, adding or removing a function call between files — to confirm that the incremental roles stay consistent with a full rebuild even when fan_in/fan_out values shift. Without this, the parity guarantee is weaker than it appears.

Contributor Author

Fixed — added a new Incremental build parity: structural change (add/remove call) test suite that rewrites app.js to remove the multiply() call, then verifies nodes, edges, and roles all match between incremental and full rebuild. This exercises the edge-removal scenario and caught the stale-roles bug that is now fixed via edge-neighbour expansion.

Address Greptile review feedback:
- Fix stale roles for unchanged callers by expanding the affected file
  set to include edge neighbours (callers/callees of changed files)
- Remove duplicate median() helper from classifyNodeRolesIncremental,
  export and reuse the one from graph/classifiers/roles.ts
- Increase performance assertion thresholds from 50ms to 200ms to
  prevent flaky CI failures on slow runners
- Add structural-change parity test that removes a cross-file call
  and verifies incremental roles match a full rebuild

Impact: 3 functions changed, 4 affected
@carlos-alm
Contributor Author

@greptileai

Comment on lines +642 to +655
  // 6. Build summary (only for affected nodes) and update only those nodes
  const summary: RoleSummary = { ...emptySummary };
  const idsByRole = new Map<string, number[]>();
  for (const row of rows) {
    const role = roleMap.get(String(row.id)) || 'leaf';
    if (role.startsWith('dead')) summary.dead++;
    summary[role] = (summary[role] || 0) + 1;
    let ids = idsByRole.get(role);
    if (!ids) {
      ids = [];
      idsByRole.set(role, ids);
    }
    ids.push(row.id);
  }

P1 Incremental RoleSummary under-counts when logged

The summary returned by the incremental path is built solely from rows (the affected nodes), so every role counter starts at 0 and only reflects the reclassified subset. The full path returns a graph-wide summary from all nodes, so the two return types are semantically different despite sharing the same RoleSummary shape.

Currently the return value is only used for a debug log line:

debug(
  `Roles${changedFileList ? ` (incremental, ${changedFileList.length} files)` : ''}: ${Object.entries(roleSummary)...}`,
);

This is harmless today, and the log prefix (incremental, N files) makes the limitation visible. The risk is that a future caller treats the returned counts as global totals and makes incorrect decisions (e.g., "0 core nodes" triggering a warning).

Consider either:

  • Documenting the semantics explicitly on classifyNodeRoles's return type (JSDoc noting that incremental returns are scoped to affected nodes), or
  • Returning null from the incremental path to signal "partial data" to callers.

This is not a bug today but a potential footgun as the codebase evolves.

Contributor Author

Fixed — added a JSDoc comment to classifyNodeRoles explicitly documenting that when changedFiles is provided, the returned RoleSummary reflects only the affected subset, not the entire graph. Callers needing graph-wide totals should omit changedFiles or query the DB directly.
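
The PR resolved this with a JSDoc note. The reviewer's other suggestion, signalling partial data in the return value itself, could look roughly like the sketch below. `ScopedRoleSummary`, `summarize`, and `hasNoCoreNodes` are hypothetical names and are not part of the codegraph code; the sketch uses a scope tag rather than `null` so the partial counts remain available for logging.

```typescript
type RoleSummary = Record<string, number>;

// Hypothetical wrapper that makes the scope of the counts explicit.
interface ScopedRoleSummary {
  scope: 'graph' | 'affected-subset';
  counts: RoleSummary;
}

// Tally how many nodes carry each role.
function countRoles(roles: string[]): RoleSummary {
  const counts: RoleSummary = {};
  for (const role of roles) {
    counts[role] = (counts[role] ?? 0) + 1;
  }
  return counts;
}

// Tag the summary with its scope depending on the build mode.
function summarize(roles: string[], incremental: boolean): ScopedRoleSummary {
  return { scope: incremental ? 'affected-subset' : 'graph', counts: countRoles(roles) };
}

// Example caller that only trusts graph-wide totals: it returns null
// ("cannot tell") when handed a partial summary.
function hasNoCoreNodes(summary: ScopedRoleSummary): boolean | null {
  if (summary.scope !== 'graph') return null;
  return (summary.counts['core'] ?? 0) === 0;
}
```

The trade-off is a wider return type for every caller, which is more than a debug-log-only consumer needs. That may be why documenting the semantics was considered sufficient here.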

The path command previously only supported symbol-to-symbol queries.
Add file-level BFS that traverses import edges between files, enabling
queries like `codegraph path cli.ts db/index.ts --file` for sync-phase
analysis and module dependency exploration.

- filePathData() in dependencies.ts: file-level BFS over import edges
- CLI: -f/--file flag, defaults edge kinds to imports,imports-type
- MCP: file_mode parameter on path tool
- 9 integration tests covering 1-hop, multi-hop, reverse, noTests, etc.

Impact: 4 functions changed, 5 affected
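
The file-level BFS described in this commit can be sketched as below. This is an illustration under stated assumptions, not the code in dependencies.ts: the `FilePathResult` shape, the `shortestImportPath` name, the in-memory adjacency map, and the default depth limit are all hypothetical. The review above confirms only that the real filePathData() uses BFS with a depth limit and reconstructs the path through a parent map.

```typescript
// Hypothetical result shape; the real filePathData() return type is not
// shown in this conversation.
interface FilePathResult { found: boolean; path: string[]; hops: number }

// Breadth-first search over file-level import edges. The first time the
// target is reached is a shortest path; it is rebuilt via the parent map.
function shortestImportPath(
  imports: Map<string, string[]>, // file -> files it imports
  from: string,
  to: string,
  maxDepth = 10,
): FilePathResult {
  if (from === to) return { found: true, path: [from], hops: 0 };
  const parent = new Map<string, string>();
  const visited = new Set<string>([from]);
  let frontier = [from];
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const file of frontier) {
      for (const dep of imports.get(file) ?? []) {
        if (visited.has(dep)) continue;
        visited.add(dep);
        parent.set(dep, file);
        if (dep === to) {
          // Walk parents back to `from`, which has no parent entry.
          const path = [to];
          let cur = parent.get(to);
          while (cur !== undefined) {
            path.unshift(cur);
            cur = parent.get(cur);
          }
          return { found: true, path, hops: path.length - 1 };
        }
        next.push(dep);
      }
    }
    frontier = next;
  }
  return { found: false, path: [], hops: 0 };
}
```

A reverse query can reuse the same function by building the adjacency map from importer lists instead of import lists.
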
@carlos-alm
Contributor Author

@greptileai

@carlos-alm merged commit 0238d10 into main on Mar 26, 2026
12 checks passed
@carlos-alm deleted the perf/incremental-rebuild branch on March 26, 2026 07:46
@github-actions bot locked and limited conversation to collaborators on Mar 26, 2026