perf(query): fix diffImpact latency regression (#904) by carlos-alm · Pull Request #905 · optave/ops-codegraph-tool

carlos-alm · 2026-04-09T06:51:32Z

Summary

Short-circuit diffImpactData when no functions are affected — skips BFS traversal, co-change lookup, ownership lookup, and boundary checks. All callers (CLI, MCP, benchmark) already return early on empty affectedFunctions, so these lookups were wasted work. Saves ~2-3ms of fixed overhead per invocation.
Pass config to checkBoundaryViolations — eliminates redundant loadConfig(repoRoot) call (was being called twice with different cwd keys, bypassing the cache).
Strip pre-release suffix in semverCompare — semverCompare('3.9.3-dev.6', '3.9.1') returned -1 because Number('3-dev') → NaN → 0, causing shouldSkipNativeOrchestrator to flag all pre-release builds as buggy.

Investigation findings

The 3.9.2 regression (WASM +94%, native +28%) is primarily CI variance on a low baseline — diffImpactData code is identical between 3.9.1 and 3.9.2, and affectedFunctions: 0 in both versions means BFS never runs. The entire measurement is fixed overhead: execFileSync('git diff') (~3-8ms) plus unnecessary lookups (~2-3ms). The WASM +94% (+6.1ms absolute) vs native +28% (+2.1ms) disparity reflects process spawn variance across CI runners.

Closes #904

Test plan

tsc --noEmit passes
biome check passes on changed file
semverCompare unit tests pass (23/23)
CI integration tests validate diffImpact output shape unchanged

`semverCompare('3.9.3-dev.6', '3.9.1')` returned -1 (less than) because `Number('3-dev')` is NaN, which the `|| 0` fallback turned into 0, making the comparison `0 < 1`. This caused `shouldSkipNativeOrchestrator` to flag all pre-release builds as "buggy", disabling the native orchestrator fast path introduced in #897. Strip `-<prerelease>` before splitting on `.` so the numeric comparison sees `3.9.3` vs `3.9.1` correctly.

Skip co-change, ownership, and boundary lookups when findAffectedFunctions returns empty — all callers return early on this case anyway. Also pass the already-loaded config to checkBoundaryViolations to avoid a redundant loadConfig call. Saves ~2-3ms of fixed overhead per diffImpact invocation when the diff touches no function bodies (the common case for comment/import/type-only changes and the benchmark probe). Closes #904

claude · 2026-04-09T06:51:49Z

Claude finished @carlos-alm's task —— View job

Reviewing PR #905: diffImpact latency regression fix

Todo List:

Read and analyze changed files
Phase 1: Root cause analysis
Phase 2: Configuration change detection
Phase 3: Test integrity assessment
Phase 4: Code quality review
Phase 5: Backlog compliance grading
Provide final rating and recommendation

Root Cause Analysis

Problem Being Solved: Performance regression in diffImpact where fixed overhead operations (co-change lookup, ownership lookup, boundary checks) were being executed even when no functions were affected by changes. Additionally, semverCompare was incorrectly flagging pre-release versions as "buggy" due to NaN comparison issue.

Why This Approach: This is a legitimate optimization addressing real waste. When affectedFunctions.length === 0, the subsequent BFS traversal and expensive lookups provide no value since all callers (CLI, MCP, benchmark) return early on empty results. The semverCompare fix addresses a specific bug where pre-release versions were incorrectly parsed.

Risk Assessment: Low risk. The changes are conservative optimizations that skip work when it would be wasted. The short-circuiting preserves the exact same output structure, just populated with appropriate empty/zero values.

Backlog Compliance

Zero-dep: ✓ — No new runtime dependencies added
Foundation-aligned: ✓ — Pure performance optimization, no architectural changes
Problem-fit: 4 — Improves agent response time by reducing latency overhead
Breaking: No — Output structure identical, purely additive optimization
Tier: 1

Code Quality Assessment

diff-impact.ts Changes ✓

The short-circuit logic at lines 314-334 is well-implemented:

Correctly identifies when no functions are affected
Returns identical structure with appropriate empty values
Preserves all expected fields for callers
The config passing optimization (line 354) eliminates redundant loadConfig() call

update-check.ts Changes ✓

The semverCompare fix (lines 24-25) is correct:

Strips pre-release suffixes before parsing: '3.9.3-dev.6' → '3.9.3'
Prevents NaN from Number('3-dev') causing incorrect comparisons
Maintains backward compatibility for normal semver strings

Test Coverage ✓

New test cases adequately cover the semverCompare pre-release stripping:

Tests both sides with pre-release suffixes (lines 55-62)
Validates the specific bug case: semverCompare('3.9.3-dev.6', '3.9.1') should return 1, not -1

Technical Concerns: None

The short-circuit preserves output contract exactly
No potential for introducing bugs since the skipped operations would have been ignored by callers anyway
Performance improvement is measurable (~2-3ms fixed overhead elimination)

Final Assessment

This is a solid performance optimization that addresses real inefficiencies without changing behavior. The semverCompare fix resolves a legitimate bug affecting pre-release builds. Both changes are technically sound with appropriate test coverage.

The claimed "CI variance" explanation is validated by the code analysis - the regression was indeed fixed overhead being measured on a small baseline, not a functional issue with the core diffImpact logic.

## Final Recommendation
- Rating: ⭐⭐⭐⭐☆ (4/5)
- Action: APPROVE
- Reasoning: Well-executed performance optimization with clear benefits, appropriate test coverage, and no breaking changes. Minor deduction only because it's purely optimization rather than new functionality.

greptile-apps · 2026-04-09T06:54:53Z

Greptile Summary

This PR eliminates a latency regression in diffImpactData via three targeted fixes: (1) a short-circuit when no functions are affected that skips BFS, co-change, and ownership lookups while preserving file-scoped boundary checks (addressing the prior review concern); (2) passing pre-loaded config through to checkBoundaryViolations to avoid a redundant loadConfig() call; and (3) stripping pre-release suffixes in semverCompare so version comparisons like 3.9.3-dev.6 vs 3.9.1 work correctly.

Confidence Score: 5/5

Safe to merge — all three fixes are correct and the prior boundary-violation concern has been resolved.

The short-circuit is logically sound and the return shape aligns with existing early-exit paths (summary: null). Boundary checks are preserved for file-only changes. The config pass-through fix is straightforward. semverCompare now correctly strips pre-release suffixes. No remaining P0/P1 findings.

No files require special attention.

Vulnerabilities

No security concerns identified. The execFileSync call for git diff already uses an array argument (no shell interpolation), and config is passed through rather than re-loaded, reducing the attack surface slightly.

Important Files Changed

Filename	Overview
src/domain/analysis/diff-impact.ts	Short-circuit added for empty `affectedFunctions` case; boundary checks correctly preserved; config pass-through eliminates duplicate `loadConfig` call. Prior review concern about suppressed boundary violations is resolved.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[diffImpactData called] --> B{Git repo?}
    B -- No --> ERR[return error]
    B -- Yes --> C[runGitDiff]
    C --> D{Output empty?}
    D -- Yes --> E[return changedFiles:0, summary:null]
    D -- No --> F[parseGitDiff]
    F --> G{changedRanges empty?}
    G -- Yes --> E
    G -- No --> H[findAffectedFunctions]
    H --> I{affectedFunctions.length === 0?}
    I -- Yes NEW --> J[checkBoundaryViolations file-scoped]
    J --> K[return with historicallyCoupled:empty, ownership:null, summary:null]
    I -- No --> L[buildFunctionImpactResults BFS]
    L --> M[lookupCoChanges]
    M --> N[lookupOwnership]
    N --> O[checkBoundaryViolations file-scoped]
    O --> P[return full result with summary object]

_{Reviews (3): Last reviewed commit: "Merge branch 'main' into fix/semver-prer..." | Re-trigger Greptile}

greptile-apps · 2026-04-09T06:54:57Z

src/domain/analysis/diff-impact.ts

+    if (affectedFunctions.length === 0) {
+      const base = {
+        changedFiles: changedRanges.size,
+        newFiles: [...newFiles],
+        affectedFunctions: [] as unknown[],
+        affectedFiles: [] as string[],
+        historicallyCoupled: [] as unknown[],
+        ownership: null,
+        boundaryViolations: [] as unknown[],
+        boundaryViolationCount: 0,
+        summary: {
+          functionsChanged: 0,
+          callersAffected: 0,
+          filesAffected: 0,
+          historicallyCoupledCount: 0,
+          ownersAffected: 0,
+          boundaryViolationCount: 0,
+        },
+      };
+      return paginateResult(base, 'affectedFunctions', { limit: opts.limit, offset: opts.offset });
+    }


Boundary checks silently skipped for file-only changes

checkBoundaryViolations is scoped to files (scopeFiles: [...changedRanges.keys()]), not functions — so it can surface violations from pure import/type-alias changes even when no function body was touched. The new early return hardcodes boundaryViolations: [] / boundaryViolationCount: 0 for that case, silently suppressing any real violations.

If a caller uses summary.boundaryViolationCount (rather than checking functionsChanged first), it will see 0 even when architectural boundaries are actually violated. The PR description asserts callers return early on empty affectedFunctions, but that claim should be validated before skipping a check that is semantically independent of function-level changes. At minimum, consider preserving the boundary check in this branch:

const { boundaryViolations, boundaryViolationCount } = checkBoundaryViolations( db, changedRanges, noTests, { ...opts, config }, repoRoot, ); const base = { ... boundaryViolations, boundaryViolationCount, summary: { ... boundaryViolationCount, }, }; return paginateResult(base, 'affectedFunctions', { limit: opts.limit, offset: opts.offset });

Fixed in 14bf7c4 — the short-circuit path now preserves the boundary check since checkBoundaryViolations is file-scoped (via scopeFiles: [...changedRanges.keys()]) and can surface real violations even when no function bodies were touched. The return shape is also aligned: summary: null now matches the two existing early-exit paths, so callers that branch on summary === null behave consistently.

github-actions · 2026-04-09T06:59:01Z

Codegraph Impact Analysis

1 functions changed → 3 callers affected across 3 files

diffImpactData in src/domain/analysis/diff-impact.ts:258 (3 transitive callers)

The short-circuit path was hardcoding boundaryViolations: [] when no functions were affected. Since boundary checks are file-scoped (not function-scoped), import or type-alias changes can still produce real violations. Preserve the check and align the return shape (summary: null) with the two existing early-exit paths.

carlos-alm · 2026-04-09T07:29:18Z

Addressed both Greptile findings in 14bf7c4:

Boundary checks preserved — the short-circuit path now calls checkBoundaryViolations instead of hardcoding empty arrays, since boundary evaluation is file-scoped and can catch real violations even when no function bodies changed.
Return shape aligned — the new early return now uses summary: null consistent with the two existing early-exit paths (empty diff output and zero changed ranges), so callers that branch on summary === null behave uniformly.

carlos-alm · 2026-04-09T07:29:40Z

@greptileai

carlos-alm added 2 commits April 8, 2026 23:10

greptile-apps bot reviewed Apr 9, 2026

View reviewed changes

Merge branch 'main' into fix/semver-prerelease

f201774

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(query): fix diffImpact latency regression (#904)#905

perf(query): fix diffImpact latency regression (#904)#905
carlos-alm wants to merge 4 commits intomainfrom
fix/semver-prerelease

carlos-alm commented Apr 9, 2026

Uh oh!

claude bot commented Apr 9, 2026 •

edited

Loading

Uh oh!

greptile-apps bot commented Apr 9, 2026 •

edited

Loading

Vulnerabilities

Uh oh!

greptile-apps bot Apr 9, 2026

Uh oh!

carlos-alm Apr 9, 2026

Uh oh!

github-actions bot commented Apr 9, 2026 •

edited

Loading

Uh oh!

carlos-alm commented Apr 9, 2026

Uh oh!

carlos-alm commented Apr 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

carlos-alm commented Apr 9, 2026

Summary

Investigation findings

Test plan

Uh oh!

claude bot commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewing PR #905: diffImpact latency regression fix

Root Cause Analysis

Backlog Compliance

Code Quality Assessment

diff-impact.ts Changes ✓

update-check.ts Changes ✓

Test Coverage ✓

Technical Concerns: None

Final Assessment

Uh oh!

greptile-apps bot commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Vulnerabilities

Important Files Changed

Flowchart

Uh oh!

greptile-apps bot Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

carlos-alm Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codegraph Impact Analysis

Uh oh!

carlos-alm commented Apr 9, 2026

Uh oh!

carlos-alm commented Apr 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude bot commented Apr 9, 2026 •

edited

Loading

greptile-apps bot commented Apr 9, 2026 •

edited

Loading

github-actions bot commented Apr 9, 2026 •

edited

Loading