Skip to content

fix(git): expand changed_lines to full hunk range#63

Merged
EladBezalel merged 3 commits intofrontops-dev:mainfrom
nivnahmias:fix/expand-hunk-range
Apr 20, 2026
Merged

fix(git): expand changed_lines to full hunk range#63
EladBezalel merged 3 commits intofrontops-dev:mainfrom
nivnahmias:fix/expand-hunk-range

Conversation

@nivnahmias
Copy link
Copy Markdown
Contributor

@nivnahmias nivnahmias commented Apr 20, 2026

Closes #62.

Summary

LINE_RE in src/git.rs captured only the starting line of each diff hunk and dropped the count group (,W). With --unified=0 (which get_diff always uses), a multi-line hunk like @@ -3 +3,7 @@ contributed only line 3 to changed_lines, so find_node_at_line in core.rs only looked at line 3. Any exported symbol declared mid-hunk — on, say, line 8 — was invisible. With no traceable symbol, reference traversal was skipped and downstream consumers were silently dropped from the affected set.

This is direction (1) from the issue: capture the count group, expand each hunk's starting line to the full start..start + count range. When ,W is absent git treats the hunk as a single line, so the count group stays optional and defaults to 1.

Change

src/git.rs:

-  LazyLock::new(|| Regex::new(r"@@ -.* \+(\d+)(?:,\d+)? @@").expect("line regex is valid"));
+  LazyLock::new(|| Regex::new(r"@@ -.* \+(\d+)(?:,(\d+))? @@").expect("line regex is valid"));
let mut changed_lines: Vec<usize> = line_regex
  .captures_iter(file_diff)
  .filter_map(|caps| {
    let start: usize = caps.get(1)?.as_str().parse().ok()?;
    let count: usize = caps
      .get(2)
      .and_then(|m| m.as_str().parse().ok())
      .unwrap_or(1);
    Some(start..start + count)
  })
  .flatten()
  .collect();

Tests

Added three new unit tests in src/git.rs:

  • test_parse_diff_multi_line_hunk_covers_full_range — mirrors the issue's reproduction. Hunk @@ -3 +3,7 @@ must produce [3, 4, 5, 6, 7, 8, 9]. This is the regression test.
  • test_parse_diff_shorthand_count_defaults_to_one — pins the fallback for hunks emitted without ,W (e.g. @@ -1 +1 @@).
  • test_parse_diff_pure_deletion_hunk_contributes_zero — uses a diff with a pure-deletion hunk (+Z,0) paired with a real addition hunk, so the assertion targets the "deletion contributes nothing" behavior rather than the file-skip branch.

Updated two existing tests whose expected output changes because their hunks now contribute every line rather than just the first:

  • test_parse_diff — the second file's hunk @@ -102,0 +103,2 @@ now yields [103, 104] (was [103]).
  • test_parse_diff_renamed_file_with_changes — hunks @@ -5,1 +5,1 @@ and @@ -20,0 +20,3 @@ now yield [5, 20, 21, 22] (was [5, 20]).

All 13 git::tests::* pass (cargo test --lib), along with the full 184-test --lib suite. cargo fmt -- --check passes. cargo clippy on src/git.rs emits no warnings.

Caller impact

Reviewed every use of ChangedFile.changed_lines:

  • core.rs:147-161 — for each line, looks up a symbol via find_node_at_line, then deduplicates via unique_symbols: FxHashSet<&String>. Extra lines inside the same exported declaration collapse into one trace; extra lines outside any declaration contribute an empty Vec<String> and do nothing. No behavioural change except that the bug is fixed.
  • core.rs:269-289 — one AffectCause::DirectChange { line } entry per line for asset causes. The report will now list more cause entries per affected package on multi-line hunks, which is a correctness improvement (previously under-reported).
  • lockfile.rs — only reads file_path; never reads changed_lines outside test fixtures.

No changes to ChangedFile, no new public API.

Out of scope

Directions (2) and (3) from the issue (fallback to all exports; old/new AST matching) are deferred as future improvements per the maintainer's triage.

Manual validation

Against the minimal fixture from the issue (/tmp/domino-repro, a two-package Rush workspace where package-b imports a symbol from package-a's foo.ts):

Branch Before After
simple-edit (1-char body change) [package-a, package-b] [package-a, package-b]
rewrite (interface inserted before export) [package-a] [package-a, package-b]

Both branches now correctly mark package-b as affected.

Summary by CodeRabbit

  • Bug Fixes

    • Git diff parsing now correctly recognizes optional hunk lengths and records the full inclusive new-side line ranges for multi-line hunks, and preserves files that are deletion-only with an empty changed-line list.
  • Tests

    • Added and updated regression tests for multi-line hunks, shorthand hunk counts, pure-deletion hunks, and full-range line coverage.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ea68f595-5ab8-4896-bf0a-16d12cc7737f

📥 Commits

Reviewing files that changed from the base of the PR and between c19678c and 4f06b53.

📒 Files selected for processing (1)
  • src/git.rs

📝 Walkthrough

Walkthrough

This change updates unified-diff parsing to capture optional hunk lengths and to expand each hunk into an inclusive new-side line range; it preserves files that are deletion-only by returning an empty changed-lines list and updates/extends unit tests accordingly.

Changes

Cohort / File(s) Summary
Diff parsing & tests
src/git.rs
Hunk-header regex now captures optional new-side length; parse_diff expands hunks into full inclusive new-side ranges (Z..Z+W, W defaults to 1); treats pure-deletion hunks as keeping the file with empty changed_lines; updated and added unit tests for multiline hunks, shorthand counts, and deletion-only cases.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I sniff the hunks, I parse each line,
Counts once hidden now brightly shine,
From Z to Z+W I leap and bound,
Empty deletions kept safe and sound,
A rabbit's hop through diffs divine.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and clearly summarizes the main change: expanding changed_lines to cover the full hunk range instead of just the starting line.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/git.rs (1)

195-219: ⚠️ Potential issue | 🟠 Major

Keep deletion-only text files in ChangedFile.

For a file whose only hunks are +Z,0, the expanded ranges are empty, then Line 218 drops the file entirely. That turns a real source change, such as deleting an exported symbol, into “no changed file”.

🐛 Proposed fix
-      let mut changed_lines: Vec<usize> = line_regex
+      let ranges: Vec<_> = line_regex
         .captures_iter(file_diff)
         .filter_map(|caps| {
           let start: usize = caps.get(1)?.as_str().parse().ok()?;
           let count: usize = caps
             .get(2)
             .and_then(|m| m.as_str().parse().ok())
             .unwrap_or(1);
           Some(start..start + count)
         })
-        .flatten()
         .collect();
+      let has_text_hunks = !ranges.is_empty();
+      let mut changed_lines: Vec<usize> = ranges.into_iter().flatten().collect();
 
       if changed_lines.is_empty() {
         if is_rename_or_copy {
           changed_lines.push(1);
+        } else if has_text_hunks {
+          debug!("Only deletion hunks found for file: {}", file_path);
         } else if file_diff
           .lines()
           .any(|line| line.starts_with("Binary files"))

Add a regression test for the skipped branch:

#[test]
fn test_parse_diff_deletion_only_hunk_keeps_file_with_no_lines() {
  let diff = r#"diff --git a/src/foo.ts b/src/foo.ts
index 1234567..abcdefg 100644
--- a/src/foo.ts
+++ b/src/foo.ts
@@ -5,3 +5,0 @@ export function foo() {
-  deleted one
-  deleted two
-  deleted three
"#;

  let result = parse_diff(diff).unwrap();
  assert_eq!(result.len(), 1);
  assert_eq!(result[0].file_path.to_str().unwrap(), "src/foo.ts");
  assert!(result[0].changed_lines.is_empty());
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/git.rs` around lines 195 - 219, The code currently drops files when
changed_lines is empty, which loses deletion-only hunks (e.g. +Z,0); update
parse_diff so it detects when line_regex captures include a hunk with count == 0
(i.e. deletion-only) and treat that as a valid change: in the captures_iter loop
(where line_regex, start, count and changed_lines are computed) set a flag (e.g.
deletion_only_hunk) when count == 0, and when changed_lines.is_empty() do not
return None if deletion_only_hunk is true — instead construct a ChangedFile with
an empty changed_lines (preserving the file entry). Ensure the logic that
currently checks is_rename_or_copy and binary files remains unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/git.rs`:
- Line 460: Replace the inline external URL in the test comment that currently
reads "Regression test for https://github.com/frontops-dev/domino/issues/62."
with an obfuscated/local reference such as "Regression test for issue `#62`" (or
"Regression test for issue 62") so the repository name is removed; locate the
comment string in src/git.rs (the line containing "Regression test for
https://.../domino/issues/62") and update it accordingly.

---

Outside diff comments:
In `@src/git.rs`:
- Around line 195-219: The code currently drops files when changed_lines is
empty, which loses deletion-only hunks (e.g. +Z,0); update parse_diff so it
detects when line_regex captures include a hunk with count == 0 (i.e.
deletion-only) and treat that as a valid change: in the captures_iter loop
(where line_regex, start, count and changed_lines are computed) set a flag (e.g.
deletion_only_hunk) when count == 0, and when changed_lines.is_empty() do not
return None if deletion_only_hunk is true — instead construct a ChangedFile with
an empty changed_lines (preserving the file entry). Ensure the logic that
currently checks is_rename_or_copy and binary files remains unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4ea49eed-cd58-4e2d-bf1e-cd44ecc7dcd5

📥 Commits

Reviewing files that changed from the base of the PR and between d5f105b and c19678c.

📒 Files selected for processing (1)
  • src/git.rs

Comment thread src/git.rs Outdated
nivnahmias added a commit to nivnahmias/domino that referenced this pull request Apr 20, 2026
Two follow-ups from code review on frontops-dev#63:

1. A file whose only hunks are deletions (`+Z,0`) expanded to an empty
   range and was dropped by the `return None` branch in `parse_diff`.
   That regressed the prior behaviour, where the starting line was still
   captured and the file kept. Deleting an exported symbol is a real
   source change: the file's owning package should remain affected even
   if there are no new-file lines to AST-lookup. Track whether any hunk
   headers were seen (even zero-count ones) and keep the file with empty
   `changed_lines` in that case.

2. Obfuscate the repo URL in the regression test doc comment per review
   preference.
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 1 file

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/git.rs">

<violation number="1" location="src/git.rs:203">
P2: Pure-deletion hunks can now disappear from `parse_diff` output because the new range expansion returns no lines for `+Z,0`, and the empty-line fallback drops the file.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread src/git.rs
Copy link
Copy Markdown
Collaborator

@EladBezalel EladBezalel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean, minimal fix that directly addresses the root cause from #62. The regex and range expansion are correct. A few suggestions below, nothing blocking.

Performance note

find_node_at_line is now called once per expanded line instead of once per hunk. A 500-line rewrite → 500 AST traversals instead of 1. The FxHashSet dedup at core.rs:200 prevents redundant reference traversal but not the redundant AST lookups. In practice most hunks are small and AST traversal is fast, so this is unlikely to be measurable for typical diffs. A batch find_nodes_in_range(start, end) API would be a nice follow-up optimization but shouldn't block this correctness fix.

Pre-existing: pure-deletion files silently dropped

When every hunk in a file is a deletion (+Z,0), changed_lines is empty and parse_diff returns None — the file is excluded. The deletion test in this PR sidesteps it by pairing with an addition hunk. Not introduced by this PR, but worth a separate issue.

Comment thread src/git.rs
Some(start..start + count)
})
.flatten()
.collect();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the filter_map chain silently swallows parse errors via .ok()?. Since \d+ makes non-numeric captures impossible this is low risk, but it's inconsistent with the warn! usage elsewhere in this function. Consider:

let start: usize = caps.get(1)?.as_str().parse().ok()?;
// could be:
let start: usize = match caps.get(1)?.as_str().parse() {
    Ok(v) => v,
    Err(_) => { warn!("Failed to parse hunk start line"); return None; }
};

Comment thread src/git.rs
+ multiplier: number;
+}
+
+export const foo = (x: number, options: FooOptions = { offset: 1, multiplier: 3 }): number =>
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regression test uses @@ -3 +3,7 @@ (asymmetric, no old-side count) — the exact format from issue #62. Consider adding a case with the common symmetric format @@ -3,7 +3,7 @@ to confirm the greedy .* handles both correctly. A future regex change could break the common case without this coverage.

@nivnahmias
Copy link
Copy Markdown
Contributor Author

nivnahmias commented Apr 20, 2026

Thanks for the review!

Quick clarification on the pure-deletion finding — it's actually a regression my PR introduced, not pre-existing. In the old regex @@ -.* \+(\d+)(?:,\d+)? @@, the count group was non-capturing, so @@ -5,3 +5,0 @@ still captured 5 as the start and the file was kept (with changed_lines = [5]). After my change, count = 0 produced an empty range and the file got dropped by parse_diff. Since it's part of the same commit's blast radius, I addressed it inline in 598ebd7 rather than leaving it as a separate follow-up:

  • parse_diff now tracks whether any hunk headers were seen (even +Z,0 ones) and keeps the file with empty changed_lines in that case, so its owning package is still marked via core.rs.
  • Added test_parse_diff_deletion_only_file_kept_with_empty_lines to lock the behaviour in.

Also obfuscated the external repo URL in the regression test comment per the other bot review.

Re: the perf note — agreed, a find_nodes_in_range batch API sounds like the right long-term shape.

@EladBezalel
Copy link
Copy Markdown
Collaborator

Good catch on the deletion regression — you're right, the old regex captured 5 from @@ -5,3 +5,0 @@ as a start line (count was non-capturing), so the file was kept. The has_hunks fix in 598ebd7 is clean.

The updated diff looks good. Two minor items still open from my inline comments — both are nits, not blocking:

  1. Silent parse error (line 203): .ok()? still swallows parse failures silently. Low risk since \d+ prevents non-numeric captures, but a warn! would be consistent with the rest of the function.

  2. Symmetric hunk test: No test covers @@ -3,7 +3,7 @@ (the common format where the old-side also has a count). The greedy .* handles it, but a test would guard against future regex changes.

Neither is blocking — LGTM on the fix.

Copy link
Copy Markdown
Collaborator

@EladBezalel EladBezalel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness fix for a real bug, minimal scope, solid test coverage including the deletion-only regression catch. LGTM.

nivnahmias added a commit to nivnahmias/domino that referenced this pull request Apr 20, 2026
Addresses non-blocking reviewer feedback on frontops-dev#63:

- Previously `.ok()?` silently swallowed `usize` parse failures on the
  hunk start and count captures. The `\d+` in `LINE_RE` guarantees the
  captures are numeric so the only realistic failure mode is overflow,
  but a `warn!` matches the rest of the function's logging posture.
- Added `test_parse_diff_symmetric_hunk_with_old_and_new_counts` for the
  `@@ -3,3 +3,3 @@` shape. The greedy `.*` in `LINE_RE` already handles
  it, but a test locks it in against future regex changes.
LINE_RE captured only the starting line of each diff hunk, dropping the
rest. With --unified=0 (which get_diff always uses), a multi-line hunk
like `@@ -3 +3,7 @@` contributed only line 3 to `changed_lines`, so
`find_node_at_line` never examined line 8 where the exported symbol
actually lived. Reference traversal was skipped and downstream consumers
silently dropped from the affected set.

Capture the count group and expand each hunk's starting line to the
full `start..start + count` range. When `,W` is omitted git treats
the hunk as a single line (count = 1), so the regex's count group stays
optional and defaults to 1.

Add regression coverage:

- test_parse_diff_multi_line_hunk_covers_full_range mirrors the issue
  reproduction (`@@ -3 +3,7 @@`) and asserts the full range is recorded.
- test_parse_diff_shorthand_count_defaults_to_one pins the fallback for
  hunks lacking the `,W` suffix.
- test_parse_diff_pure_deletion_hunk_contributes_zero checks that a
  deletion hunk (`+Z,0`) adds nothing and doesn't trip the file-skip
  branch when paired with a real addition hunk.

Two existing tests updated to assert the newly-covered lines:
  test_parse_diff                           [103] -> [103, 104]
  test_parse_diff_renamed_file_with_changes [5, 20] -> [5, 20, 21, 22]

Closes frontops-dev#62
Two follow-ups from code review on frontops-dev#63:

1. A file whose only hunks are deletions (`+Z,0`) expanded to an empty
   range and was dropped by the `return None` branch in `parse_diff`.
   That regressed the prior behaviour, where the starting line was still
   captured and the file kept. Deleting an exported symbol is a real
   source change: the file's owning package should remain affected even
   if there are no new-file lines to AST-lookup. Track whether any hunk
   headers were seen (even zero-count ones) and keep the file with empty
   `changed_lines` in that case.

2. Obfuscate the repo URL in the regression test doc comment per review
   preference.
Addresses non-blocking reviewer feedback on frontops-dev#63:

- Previously `.ok()?` silently swallowed `usize` parse failures on the
  hunk start and count captures. The `\d+` in `LINE_RE` guarantees the
  captures are numeric so the only realistic failure mode is overflow,
  but a `warn!` matches the rest of the function's logging posture.
- Added `test_parse_diff_symmetric_hunk_with_old_and_new_counts` for the
  `@@ -3,3 +3,3 @@` shape. The greedy `.*` in `LINE_RE` already handles
  it, but a test locks it in against future regex changes.
@nivnahmias nivnahmias force-pushed the fix/expand-hunk-range branch from 0fb8b5e to 4f06b53 Compare April 20, 2026 10:53
@github-actions
Copy link
Copy Markdown
Contributor

📦 Preview Release Available

A preview release has been published for commit 4f06b53.

Installation

npm install https://github.com/frontops-dev/domino/releases/download/pr-63-4f06b53/front-ops-domino-1.2.2.tgz

Running the preview

npx https://github.com/frontops-dev/domino/releases/download/pr-63-4f06b53/front-ops-domino-1.2.2.tgz affected

Details

@EladBezalel EladBezalel merged commit 0d1898c into frontops-dev:main Apr 20, 2026
26 of 50 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

affected silently misses downstream consumers when hunk start line is a non-export (imports / interfaces)

2 participants