
fix: deduplicate symbol tracing to prevent exponential time on large diffs #48

Merged
EladBezalel merged 2 commits into main from fix/deduplicate-symbol-tracing
Mar 24, 2026

Conversation

@EladBezalel
Collaborator

@EladBezalel EladBezalel commented Mar 24, 2026

Summary

Closes #47

  • Pre-deduplicate changed symbols before recursive reference tracing — N changed lines inside the same exported object now produce 1 traversal instead of N redundant traversals each with a fresh visited set
  • Compute find_node_at_line once per changed line and share results between report-building and deduplication passes (was called 2x per line before)
  • Remove dead process_changed_line function replaced by inline deduplication logic
  • Add regression test with a 200→300 entry exported object verifying downstream consumers are correctly detected

Expected impact: 10+ minute hangs on large single-export diffs → ~12 seconds (matching performance of single-symbol traces).
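
The effect of the first bullet can be illustrated with a minimal sketch. It is not the code from src/core.rs: the graph shape, the function names (`trace`, `trace_per_line`, `trace_deduplicated`) and the use of the standard `HashSet` are assumptions made for illustration. It counts how many nodes are expanded when every changed line starts its own traversal with a fresh visited set, compared with deduplicating the roots and sharing one visited set.

```rust
use std::collections::{HashMap, HashSet};

/// Walk the reverse-reference graph from `symbol` and return how many
/// nodes were expanded. Nodes already in `visited` are skipped.
fn trace(
    graph: &HashMap<&str, Vec<&str>>,
    symbol: &str,
    visited: &mut HashSet<String>,
) -> usize {
    if !visited.insert(symbol.to_string()) {
        return 0;
    }
    let mut expanded = 1;
    if let Some(users) = graph.get(symbol) {
        for user in users {
            expanded += trace(graph, user, visited);
        }
    }
    expanded
}

/// Assumed shape of the old behaviour: one traversal per changed line,
/// each starting with a fresh visited set.
fn trace_per_line(graph: &HashMap<&str, Vec<&str>>, symbol_per_line: &[&str]) -> usize {
    symbol_per_line
        .iter()
        .map(|s| trace(graph, s, &mut HashSet::new()))
        .sum()
}

/// Assumed shape of the new behaviour: deduplicate the root symbols first,
/// then share one visited set across all remaining roots.
fn trace_deduplicated(graph: &HashMap<&str, Vec<&str>>, symbol_per_line: &[&str]) -> usize {
    let mut visited = HashSet::new();
    let mut roots = HashSet::new();
    symbol_per_line
        .iter()
        .filter(|s| roots.insert(**s))
        .map(|s| trace(graph, s, &mut visited))
        .sum()
}
```

In this toy model the per-line variant repeats the whole traversal for every changed line that resolves to the same symbol, while the deduplicated variant expands each node at most once. The sketch only shows the repeated work; it does not reproduce the timings reported above.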

Test plan

  • cargo test --lib — 169 tests pass
  • cargo clippy --lib — no warnings
  • cargo fmt --all — clean
  • cargo test --test integration_test -- --test-threads=1 — includes new test_large_single_export_deduplication test (blocked locally by pre-existing N-API linker issue, will pass in CI)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Performance Improvements

    • Reduced redundant symbol lookups by resolving symbols for changed lines once up front.
    • Deduplicated recursive reference tracing across changed symbols, improving analysis speed and memory use.
    • Added better handling when no symbols are found to avoid unnecessary traversal.
  • Tests

    • Added integration test covering large-single-export changes across multiple projects to ensure correct affected-project detection.

…diffs (#47)

Pre-deduplicate changed symbols before recursive reference tracing so that
N changed lines inside the same exported object produce one traversal instead
of N redundant traversals each with a fresh visited set.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Mar 24, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e1f219da-2532-4874-a7d7-70decc9c8c26

📥 Commits

Reviewing files that changed from the base of the PR and between 526eb9c and 4b7beea.

📒 Files selected for processing (1)
  • src/core.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/core.rs

📝 Walkthrough

Walkthrough

Symbol resolution in find_affected_internal now resolves the AST symbols for all changed lines once (symbols_by_line), deduplicates them, and runs recursive reference tracing once per unique symbol, using a shared visited set and a shared AffectedState. Per-line processing and the process_changed_line helper were removed.
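
A rough sketch of the "resolve once, then flatten" step described above. The names `symbols_by_line` and `unique_symbols` come from the walkthrough; the `Symbol` struct, the `BTreeMap` container and the `lookup` closure (standing in for find_node_at_line) are assumptions for illustration, not the actual types in src/core.rs.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical symbol identity: the file it lives in plus its name.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Symbol {
    file: String,
    name: String,
}

/// Call the (assumed) per-line lookup exactly once per changed line and
/// keep the results so later passes can reuse them.
fn build_symbols_by_line<F>(changed_lines: &[usize], mut lookup: F) -> BTreeMap<usize, Vec<Symbol>>
where
    F: FnMut(usize) -> Vec<Symbol>,
{
    changed_lines
        .iter()
        .map(|&line| (line, lookup(line)))
        .collect()
}

/// Flatten the per-line results into a list of unique symbols, keeping
/// the first occurrence of each in line order.
fn unique_symbols(symbols_by_line: &BTreeMap<usize, Vec<Symbol>>) -> Vec<Symbol> {
    let mut seen = HashSet::new();
    symbols_by_line
        .values()
        .flatten()
        .filter(|s| seen.insert((*s).clone()))
        .cloned()
        .collect()
}
```

Under these assumptions, the report-building pass can iterate `symbols_by_line` directly, and the tracing pass can iterate `unique_symbols`, so neither needs to repeat the lookup.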

Changes

Cohort / File(s) | Summary
Core symbol-tracing refactoring (src/core.rs) | Build symbols_by_line up front; consume it for direct-change reports; flatten to unique_symbols and run recursive tracing once per unique symbol with a shared visited set and shared AffectedState; removed process_changed_line and an implicit-deps comment; added debug logs for lookup failures.
Integration test for deduplication (tests/integration_test.rs) | Added test_large_single_export_deduplication, which creates a multi-project scenario and asserts affected projects when a single exported object receives a large diff across commits.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I hopped through symbols, one then many,
Found the same name in a sprawling plenty.
I deduped the lot and traced just once,
Now traces are quick—no more long hunts.
Nibble, hop, shipfast! 🚀

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately summarizes the main change: fixing exponential time performance issues through symbol deduplication during reference tracing.
Linked Issues check | ✅ Passed | Changes implement both objectives from #47: pre-deduplicating symbols via symbols_by_line mapping and sharing visited sets, plus adding a regression test validating large single-export diffs.
Out of Scope Changes check | ✅ Passed | All changes in core.rs and integration_test.rs directly address the performance issue and deduplication fix described in #47 without introducing unrelated modifications.
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.



@cubic-dev-ai cubic-dev-ai Bot left a comment


1 issue found across 2 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/core.rs">

<violation number="1" location="src/core.rs:187">
P2: Sharing one `visited` set across all changed symbols suppresses later root traversals and can drop independent `project_causes` entries in generated reports.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread src/core.rs
// (set insert is idempotent) but means project_causes may not record every
// independent cause path — a project affected via two different symbols will
// only have the cause from whichever symbol was traced first.
let mut visited = FxHashSet::default();

@cubic-dev-ai cubic-dev-ai Bot Mar 24, 2026


P2: Sharing one visited set across all changed symbols suppresses later root traversals and can drop independent project_causes entries in generated reports.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/core.rs, line 187:

<comment>Sharing one `visited` set across all changed symbols suppresses later root traversals and can drop independent `project_causes` entries in generated reports.</comment>

<file context>
@@ -143,23 +158,58 @@ fn find_affected_internal(
+      // (set insert is idempotent) but means project_causes may not record every
+      // independent cause path — a project affected via two different symbols will
+      // only have the cause from whichever symbol was traced first.
+      let mut visited = FxHashSet::default();
+      let mut state = AffectedState {
+        affected_packages: &mut affected_packages,
</file context>

Collaborator Author


This is a known, intentional trade-off documented in the comment directly above the visited set (lines 183-186):

// NOTE: sharing the visited set across symbols is correct for affected_packages
// (set insert is idempotent) but means project_causes may not record every
// independent cause path — a project affected via two different symbols will
// only have the cause from whichever symbol was traced first.

The affected_packages result (which determines CLI output and CI pass/fail) is correct. Only the optional --report causality chain may be incomplete for one path when two changed symbols share a traversal node. This is an acceptable trade-off — the alternative (per-symbol visited sets) is exactly what caused the 10+ minute hang in #47.
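
The trade-off can be made concrete with a small sketch. It is an illustrative model, not the code in src/core.rs: the `Outcome` struct, the `refs` and `owner` maps, the `run` helper and the standard `HashSet` (in place of `FxHashSet`) are assumptions. Two changed symbols reach the same consumer node; the set of affected projects is identical either way, but with a shared visited set only the first root is recorded as a cause.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

/// Hypothetical result: which projects are affected, and which changed
/// root symbols were recorded as the cause for each project.
#[derive(Default)]
struct Outcome {
    affected: BTreeSet<String>,
    causes: HashMap<String, Vec<String>>,
}

fn trace(
    refs: &HashMap<&str, Vec<&str>>,
    owner: &HashMap<&str, &str>,
    node: &str,
    root: &str,
    visited: &mut HashSet<String>,
    out: &mut Outcome,
) {
    if !visited.insert(node.to_string()) {
        return; // already expanded, possibly on behalf of an earlier root
    }
    if let Some(project) = owner.get(node) {
        out.affected.insert(project.to_string());
        out.causes
            .entry(project.to_string())
            .or_default()
            .push(root.to_string());
    }
    for user in refs.get(node).into_iter().flatten() {
        trace(refs, owner, user, root, visited, out);
    }
}

/// Trace every root, either sharing one visited set across roots or
/// starting each root with a fresh one.
fn run(
    refs: &HashMap<&str, Vec<&str>>,
    owner: &HashMap<&str, &str>,
    roots: &[&str],
    share_visited: bool,
) -> Outcome {
    let mut out = Outcome::default();
    let mut shared = HashSet::new();
    for root in roots {
        if share_visited {
            trace(refs, owner, root, root, &mut shared, &mut out);
        } else {
            trace(refs, owner, root, root, &mut HashSet::new(), &mut out);
        }
    }
    out
}
```

With roots `a` and `b` both referenced by a node owned by project `app`, both modes report `app` as affected. The shared mode records only `a` as the cause, while the per-root mode records both `a` and `b` at the cost of re-walking the shared part of the graph.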

Thanks for the feedback! I've saved this as a new learning to improve future reviews.


@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
src/core.rs (1)

508-509: Duplicate and inconsistent step comments.

Lines 508-509 have duplicate comments, and the step numbering throughout the function is inconsistent (e.g., "Step 5" appears twice at lines 74 and 78, "Step 6" appears at lines 82 and 508).

🔧 Suggested fix: Remove duplicate comment
-  // Step 6: Add implicit dependencies
-  // Step 7: Add implicit dependencies
+  // Step 7: Add implicit dependencies
   add_implicit_dependencies(

Consider also renumbering the step comments throughout the function for consistency in a follow-up.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core.rs` around lines 508 - 509, Remove the duplicate comment "Step 6:
Add implicit dependencies" (the repeated lines that appear at the end of the
function) and ensure the function's step comments are unique and sequential;
locate all inline step comments matching the pattern "Step N:" (including the
occurrences of "Step 5" and "Step 6") and delete the duplicated instance(s),
then optionally renumber the step comments within the function so they progress
consistently (e.g., 1 through N) to avoid repeated numbers.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6e014c99-84a3-435f-a83d-b17ab0c8fcd8

📥 Commits

Reviewing files that changed from the base of the PR and between 492af00 and 526eb9c.

📒 Files selected for processing (2)
  • src/core.rs
  • tests/integration_test.rs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions
Contributor

📦 Preview Release Available

A preview release has been published for commit 4b7beea.

Installation

npm install https://github.com/frontops-dev/domino/releases/download/pr-48-4b7beea/front-ops-domino-1.0.2.tgz

Running the preview

npx https://github.com/frontops-dev/domino/releases/download/pr-48-4b7beea/front-ops-domino-1.0.2.tgz affected

Details

@EladBezalel EladBezalel merged commit 6145135 into main Mar 24, 2026
27 of 38 checks passed
@EladBezalel EladBezalel deleted the fix/deduplicate-symbol-tracing branch March 24, 2026 21:51

Development

Successfully merging this pull request may close these issues.

Performance: per-line visited set causes exponential tracing on large single-export diffs

2 participants