Skip to content

fix(cycles): document and test engine parity gap in function-level cycle detection#602

Merged
carlos-alm merged 6 commits intomainfrom
fix/cycle-parity-native-wasm
Mar 25, 2026
Merged

fix(cycles): document and test engine parity gap in function-level cycle detection#602
carlos-alm merged 6 commits intomainfrom
fix/cycle-parity-native-wasm

Conversation

@carlos-alm
Copy link
Contributor

Summary

  • Documents the known function-level cycle count difference between native and WASM engines (e.g. 8 native vs 11 WASM) in src/domain/graph/cycles.ts
  • Adds tests demonstrating how additional edge precision can break false cycles, explaining why the native engine (more symbols, more calls) reports fewer cycles
  • The Tarjan SCC algorithm is identical in both engines -- the difference originates from the parsing/extraction layer where the native Rust extractors resolve more symbols and call edges

Root cause

The native engine extracts 10883 nodes / 4000 calls vs WASM's 10857 nodes / 3986 calls. The additional precision resolves previously ambiguous calls to their correct targets, which breaks false cycles that WASM reports. File-level cycles are unaffected because import resolution is engine-independent.

Test plan

  • All 16 cycle detection tests pass (including new parity gap tests)
  • New test demonstrates how resolving an ambiguous call edge breaks a false cycle
  • New test verifies deterministic behavior of findCyclesJS on identical edge sets

Fixes #597

…cle detection

The native (Rust) engine extracts more symbols and call edges than WASM,
which changes the function-level dependency graph topology. This causes
legitimate differences in cycle counts (e.g. 8 native vs 11 WASM) even
though the Tarjan SCC algorithm is identical in both engines.

Added documentation in cycles.ts explaining the root cause and a test
demonstrating how additional edge precision can break false cycles.

Fixes #597
@claude
Copy link

claude bot commented Mar 25, 2026

Claude finished @carlos-alm's task —— View job


PR Review: Function-Level Cycle Detection Engine Parity Gap

Todo List:

  • Read and analyze changed files
  • Understand the root cause and validate approach
  • Examine test integrity and new test cases
  • Review code quality and implementation
  • Perform backlog compliance analysis
  • Provide comprehensive review feedback

Root Cause Analysis

Problem Being Solved: Documents and tests a known discrepancy where native and WASM engines report different function-level cycle counts (e.g., 8 native vs 11 WASM) on the same codebase.

Why This Approach: The root cause is legitimate - native engine extracts more symbols/calls (10883 nodes/4000 calls vs 10857/3986 WASM), leading to better resolution that breaks false cycles. The Tarjan SCC algorithm is identical in both engines; the difference comes from parsing precision. Documentation + tests is the right approach rather than trying to artificially force parity.

Risk Assessment: LOW - This is purely documentation and testing of expected behavior, not a code change that could introduce bugs.

Backlog Compliance

  • Zero-dep: ✓ — Only adds documentation and tests using existing infrastructure
  • Foundation-aligned: ✓ — Improves tool transparency and test coverage without changing core behavior
  • Problem-fit: 3 — Doesn't directly solve the "lost AI" problem but improves tool reliability and user understanding
  • Breaking: No — Purely additive documentation and tests
  • Tier: 1

Code Quality Assessment

Documentation Quality (src/domain/graph/cycles.ts:8-25)

Excellent. The added documentation:

  • ✅ Clearly explains this is expected behavior, not a bug
  • ✅ Provides concrete numbers (10883 vs 10857 nodes, 4000 vs 3986 calls)
  • ✅ Explains the root cause (extraction precision differences)
  • ✅ Clarifies file-level cycles are unaffected (import resolution is engine-independent)
  • ✅ References the GitHub issue for full context

Test Coverage (tests/graph/cycles.test.ts:151-269)

Very Good. The new tests demonstrate:

  • ✅ Engine parity on identical edge sets (native vs JS produce same results)
  • ✅ How additional precision can break false cycles (demonstrates why native reports fewer)
  • ✅ Deterministic behavior of the Tarjan algorithm
  • ✅ Good use of describe.skipIf(!hasNative) for conditional native testing

Test Implementation Quality

Strong technical approach:

  • ✅ Uses realistic edge structures to demonstrate the concept
  • sortCycles() helper ensures deterministic comparison despite iteration order
  • ✅ Tests the key insight: resolving ambiguous edges breaks false cycles
  • ✅ Proper cleanup with db.close() in all paths

Technical Concerns

Minor Issues

  1. Line 244: Edge case - the test shows c -> a becoming c -> d, but in practice the native engine would resolve more precisely, not redirect to a different target. The test is conceptually correct but slightly artificial.

  2. Test coverage gap: The new tests demonstrate the principle but don't test against actual parsing differences between engines. However, this is acceptable since that would require complex fixture setup.

Code Style

  • ✅ Consistent with existing test patterns
  • ✅ Appropriate use of describe blocks for organization
  • ✅ Clear test names that explain intent
  • ✅ Proper use of existing test utilities

Validation Against Project Standards

Follows CLAUDE.md Guidelines

  • ✅ Only adds tests and documentation (no unexpected file changes)
  • ✅ Uses existing test infrastructure without new dependencies
  • ✅ Proper git issue reference format

Test Integrity (Phase 3 Check)

  • ✅ No weakened assertions or removed tests
  • ✅ New tests strengthen understanding rather than lowering standards
  • ✅ No eslint-disable or workarounds without justification

Minor Suggestions

  1. Line 157: Consider adding a brief note that this difference is more pronounced in complex codebases (the example uses a small test case).

  2. Test organization: The tests at lines 228-269 could benefit from a more descriptive describe block name to clearly distinguish them from the parity tests above.

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: This PR correctly addresses a real issue (user confusion about differing cycle counts) with the right solution (documentation + tests rather than artificial parity enforcement). The implementation is technically sound, the tests demonstrate the core concepts well, and it improves tool transparency. The documentation quality is excellent and will prevent future confusion. No breaking changes, no new dependencies, and strengthens the test suite.

The approach of documenting expected behavior differences rather than hiding them aligns well with the codebase's principle of transparency about tool limitations.

--- | Branch: fix/cycle-parity-native-wasm

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 25, 2026

Greptile Summary

This PR documents and tests the known engine parity gap for function-level cycle detection between the native Rust and WASM extractors. The change is purely additive: a JSDoc block is prepended to findCycles in cycles.ts and two new unit tests are added in cycles.test.ts. No algorithmic or logic changes are made.

All feedback from the previous review round has been incorporated:

  • Issue link updated to https://github.com/optave/codegraph/issues/597 in both files.
  • Test renamed from "adding a precise edge" to "resolving an ambiguous call edge to its correct target" to match the replacement (not addition) semantics.
  • Determinism test renamed from "both engines agree on identical edge sets" to "JS cycle detection is deterministic on repeated calls" to reflect that it only invokes findCyclesJS twice.
  • Inline comment corrected from a -> d to c -> d.

The two new tests are logically sound: the first correctly demonstrates that replacing c -> a with c -> d eliminates the 3-node SCC; the second verifies that findCyclesJS produces identical output across repeated calls on a 5-edge, 2-cycle graph.

Confidence Score: 5/5

  • Safe to merge — purely documentation and test additions with no logic changes, and all prior review concerns are resolved.
  • All four items flagged in the previous review round are addressed. The new JSDoc and test code are correct, and the assertions match the described invariants. No production code is modified.
  • No files require special attention.

Important Files Changed

Filename Overview
src/domain/graph/cycles.ts Adds a JSDoc block above findCycles explaining the expected native-vs-WASM cycle count divergence; no logic changes. Issue link correctly points to optave/codegraph.
tests/graph/cycles.test.ts Adds two new unit tests for extraction-level parity gap and determinism of findCyclesJS; block comment, test titles, edge labels, and issue link are all correct after previous fixes.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Codebase] --> B{Engine}
    B -->|Native Rust| C[10883 nodes / 4000 calls]
    B -->|WASM| D[10857 nodes / 3986 calls]
    C --> E[More precise call-edge targets]
    D --> F[Ambiguous call-edge targets]
    E --> G[Fewer false cycles\ne.g. 8 function-level cycles]
    F --> H[More false cycles\ne.g. 11 function-level cycles]
    G --> I[File-level parity ✓\nimport resolution is engine-independent]
    H --> I
Loading

Reviews (3): Last reviewed commit: "fix(test): rename test to reflect edge r..." | Re-trigger Greptile

Comment on lines +253 to +268
it('both engines agree on identical edge sets', () => {
// The Tarjan SCC algorithm itself is deterministic. Any cycle count
// difference between engines comes from the graph they are fed, not
// from the algorithm.
const edges = [
{ source: 'a', target: 'b' },
{ source: 'b', target: 'c' },
{ source: 'c', target: 'a' },
{ source: 'd', target: 'e' },
{ source: 'e', target: 'd' },
];
const result1 = findCyclesJS(edges);
const result2 = findCyclesJS(edges);
expect(result1).toEqual(result2);
expect(result1).toHaveLength(2);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Test doesn't compare both engines — only calls JS twice

The test is named "both engines agree on identical edge sets" and its comment states "The Tarjan SCC algorithm itself is deterministic. Any cycle count difference between engines comes from the graph they are fed, not from the algorithm." However, the test body calls findCyclesJS twice and compares the two results — it never invokes the native engine. This makes result1 and result2 trivially equal (same function, same input, same call) and doesn't actually exercise the stated invariant.

To truly verify that both engines agree on identical edge sets, the test should compare findCyclesJS(edges) against native.detectCycles(edges). That invariant is already covered in the describe.skipIf(!hasNative)('Cycle detection: native vs JS parity', ...) suite above — so either this test should be moved/merged into that suite, or it should be renamed to reflect what it actually tests: "JS implementation is deterministic on repeated calls".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Renamed the test from "both engines agree on identical edge sets" to "JS cycle detection is deterministic on repeated calls" to accurately reflect what it tests (calling findCyclesJS twice). The actual cross-engine parity is already covered in the describe.skipIf(!hasNative) suite above.

Comment on lines +232 to +233
// Adding a precise edge a -> d (resolving a previously ambiguous call)
// removes the a -> c edge, breaking the cycle.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Incorrect edge label in comment — says a -> d, should be c -> d

The comment reads "Adding a precise edge a -> d (resolving a previously ambiguous call) removes the a -> c edge". Looking at the actual resolvedEdges array below, the change is that the edge c -> a becomes c -> d — i.e. the source is c, not a. The comment incorrectly attributes the disambiguated edge to node a.

Suggested change
// Adding a precise edge a -> d (resolving a previously ambiguous call)
// removes the a -> c edge, breaking the cycle.
// Adding a precise edge c -> d (resolving a previously ambiguous call)
// removes the c -> a edge, breaking the cycle.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Updated the comment to correctly say c -> d (resolving the c -> a edge), matching the actual edge data below.

* resolved identically. The gap only manifests at function-level granularity
* where call-site extraction differs between the Rust and WASM parsers.
*
* See: https://github.com/nicobailon/codegraph/issues/597
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Issue link points to a different GitHub repository

The JSDoc See: link references https://github.com/nicobailon/codegraph/issues/597, which is a different repository (nicobailon/codegraph) from this one (optave/codegraph). The same cross-repo link also appears in the test file at line 164. If issue #597 exists in this repo, the link should point to https://github.com/optave/codegraph/issues/597. If this is intentional (e.g., the issue was originally filed on a fork), a short note explaining the cross-repo reference would help future readers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Updated both references (in cycles.ts line 24 and cycles.test.ts line 164) to point to https://github.com/optave/codegraph/issues/597.

…me (#602)

- Fix issue link to point to optave/codegraph (not nicobailon/codegraph)
- Correct inline comment: disambiguated edge is c -> d, not a -> d
- Rename misleading test: "both engines agree on identical edge sets"
  only called findCyclesJS twice, renamed to "JS cycle detection is
  deterministic on repeated calls" to reflect what it actually tests
…to fix/cycle-parity-native-wasm-review

Impact: 5 functions changed, 14 affected
@carlos-alm
Copy link
Contributor Author

@greptileai

@carlos-alm
Copy link
Contributor Author

@claude

@claude
Copy link

claude bot commented Mar 25, 2026

Claude finished @carlos-alm's task —— View job


PR Review: Engine Parity Gap Documentation and Testing

Todo List:

  • Read and analyze current state of changed files
  • Verify fixes address Greptile's feedback
  • Review overall approach and implementation quality
  • Provide comprehensive code review feedback

Greptile Feedback Analysis

All three issues have been properly addressed:

  1. Test naming fix (cycles.test.ts:253): Correctly renamed from "both engines agree on identical edge sets" to "JS cycle detection is deterministic on repeated calls" - accurately reflects that it only tests the JS implementation twice.

  2. Comment correction (cycles.test.ts:232): Fixed the edge description from "Adding a precise edge a -> d" to "Adding a precise edge c -> d" - now correctly matches the actual edge data.

  3. URL fix (cycles.ts:24 and cycles.test.ts:164): Updated both references to point to the correct repository (optave/codegraph/issues/597 instead of nicobailon/codegraph).

Comprehensive Code Review

⭐ Documentation Quality (src/domain/graph/cycles.ts:8-25)

Excellent addition. The JSDoc comment is comprehensive and well-structured:

  • ✅ Clearly explains this is expected behavior, not a bug
  • ✅ Provides concrete metrics (10883/4000 vs 10857/3986 nodes/calls)
  • ✅ Explains root cause: native engine has more precise extraction
  • ✅ Clarifies that Tarjan SCC algorithm is identical
  • ✅ Notes file-level cycles are unaffected (import resolution parity)
  • ✅ Proper issue reference for context

⭐ Test Implementation (tests/graph/cycles.test.ts:228-269)

Test 1: "adding a precise edge can break a false cycle"

  • Concept demonstration: Clearly shows how resolving ambiguous edges breaks false cycles
  • Realistic scenario: Models the actual native vs WASM difference mechanism
  • Clear structure: Sets up ambiguous case → resolves edge → verifies cycle broken
  • Fixed comment: Now correctly describes c -> d transformation

Test 2: "JS cycle detection is deterministic on repeated calls"

  • Accurate naming: Fixed misleading name that implied cross-engine comparison
  • Valid invariant: Tests that algorithm is deterministic on identical input
  • Good edge case coverage: Uses multi-cycle scenario to be thorough

⭐ Overall Approach Assessment

This is the correct solution approach:

  • Documents reality: Rather than artificially forcing parity, explains the legitimate difference
  • Educational value: Helps users understand why they see different counts
  • Preserves accuracy: Doesn't compromise the native engine's superior precision
  • Comprehensive testing: Both demonstrates the mechanism and verifies algorithm consistency

Minor Observations

  1. Test organization: The existing describe.skipIf(!hasNative) suite at lines 168-224 already covers true cross-engine parity on identical edge sets. The new tests complement this by explaining why edge sets differ in practice.

  2. Code style: Both files follow project conventions consistently with proper TypeScript usage and test patterns.

  3. No breaking changes: All changes are purely additive (documentation + tests).

Final Assessment

  • Quality: ⭐⭐⭐⭐⭐ (5/5)
  • Recommendation: APPROVE
  • Reasoning: This PR correctly addresses user confusion about differing cycle counts through high-quality documentation and educational tests. All Greptile feedback has been properly addressed. The approach of documenting expected behavior differences rather than hiding them demonstrates good engineering judgment and transparency.

The implementation strengthens the codebase by improving tool transparency and preventing future confusion about this legitimate engine difference.


Branch: fix/cycle-parity-native-wasm | View job

Comment on lines +229 to +233
it('adding a precise edge can break a false cycle', () => {
// Demonstrates why native (more edges) can report FEWER cycles than WASM.
// With ambiguous resolution, a -> b -> c -> a forms a 3-node cycle.
// Adding a precise edge c -> d (resolving a previously ambiguous call)
// removes the c -> a edge, breaking the cycle.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Test title says "adding" but test body shows "replacing"

The test name and its first-line comment both describe the change as adding a precise edge (c -> d), but resolvedEdges has the same number of edges (3) as ambiguousEdges (3) — the c -> a edge is replaced by c -> d, not supplemented. This creates a small conceptual mismatch for anyone reading the test in isolation, even though the inline comment on line 242 ("After resolving: c -> a becomes c -> d") does clarify the intent.

The block comment above (line 153) frames the native engine as resolving "more symbols and call edges," but the scenario modelled here is specifically the wrong-target case (an edge that pointed to the wrong node). Aligning the test title with what it actually demonstrates would make the intent self-evident without relying on the inline comment:

Suggested change
it('adding a precise edge can break a false cycle', () => {
// Demonstrates why native (more edges) can report FEWER cycles than WASM.
// With ambiguous resolution, a -> b -> c -> a forms a 3-node cycle.
// Adding a precise edge c -> d (resolving a previously ambiguous call)
// removes the c -> a edge, breaking the cycle.
it('resolving an ambiguous call edge to its correct target can break a false cycle', () => {
// Demonstrates why native (more precise edge targets) can report FEWER cycles than WASM.
// With ambiguous resolution, a -> b -> c -> a forms a 3-node cycle.
// Resolving the ambiguous c -> a edge to its correct target c -> d
// breaks the cycle.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Renamed the test from "adding a precise edge can break a false cycle" to "resolving an ambiguous call edge to its correct target can break a false cycle" and updated the comment to say "more precise edge targets" instead of "more edges", matching the suggestion. The test models a replacement (same edge count), not an addition, and the title now reflects that.

Align test title and comment with actual behavior: the test models
resolving an ambiguous call edge to a different target (replacement),
not adding a new edge.
@carlos-alm
Copy link
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit 18c3a8b into main Mar 25, 2026
12 checks passed
@carlos-alm carlos-alm deleted the fix/cycle-parity-native-wasm branch March 25, 2026 22:47
@github-actions github-actions bot locked and limited conversation to collaborators Mar 25, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug(cycles): function-level cycle count differs between native and WASM engines

1 participant