Skip to content

fix: exclude merged upstream commits from diffSize in push_to_pull_request_branch incremental mode#34139

Merged
pelikhan merged 6 commits into
mainfrom
copilot/fix-patch-base-calculation
May 23, 2026
Merged

fix: exclude merged upstream commits from diffSize in push_to_pull_request_branch incremental mode#34139
pelikhan merged 6 commits into
mainfrom
copilot/fix-patch-base-calculation

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented May 22, 2026

When an agent merges origin/<baseBranch> into a stale PR branch (e.g., to resolve conflicts), diffSize was computed as git diff <old-PR-head>..<local-branch> — pulling in all upstream commits reachable from the merge but not from the old PR head. A 2-commit agent change (+6/-4 lines) could produce a 9282 KB "incremental diff" containing 480 upstream commits, failing max_patch_size even though the agent's actual contribution was tiny.

Changes

generate_git_patch.cjs — smarter diff base in incremental mode

Before calling computeIncrementalDiffSize, compute the merge-base between origin/<defaultBranch> and the local branch:

  • Merge-base is an ancestor of baseCommitSha (agent didn't merge ahead of the PR head): keep baseCommitSha as the diff base — no behavior change.
  • Merge-base is NOT an ancestor of baseCommitSha (agent merged default-branch commits beyond the PR head): use the merge-base as the effective diff base, cutting out the merged upstream commits.
Before: git diff <old-PR-head>..<local>  →  agent's 2 commits + 480 upstream = 9282 KB
After:  git diff <merge-base(main,local)>..<local>  →  agent's 2 commits only = ~1 KB

The transport patch (format-patch) is unchanged — it still carries the full commit range for git am/bundle apply; only the size enforcement measurement is corrected.

generate_git_patch.test.cjs — two new incremental mode tests

  1. Merge case: verifies diffSize < 1 KB when agent merges 50+ KB of upstream changes (transport patch remains > 1 KB).
  2. No-merge case: verifies diffSize stays correct when the agent adds commits without merging the default branch (regression guard).

pr-sous-chef: update branch requested
run: https://github.com/github/gh-aw/actions/runs/26320547848

Generated by 👨‍🍳 PR Sous Chef · ● 8M ·

Copilot AI and others added 2 commits May 22, 2026 23:55
…incremental mode

When push_to_pull_request_branch targets an existing PR and the agent has
merged the default branch (e.g. origin/main) into the PR branch to resolve
conflicts, the incremental diffSize was inflated to include all of the
default branch's new commits. This caused max_patch_size validation to fail
even when the agent's actual contribution was tiny.

Root cause: the diffSize was computed as git diff <old-PR-head>..<local-branch>,
which includes all commits reachable from the local branch but not from the
old PR head — including the hundreds of upstream commits brought in by the merge.

Fix: before computing diffSize in incremental mode, check if the merge-base
between origin/<defaultBranch> and the local branch is already an ancestor of
the PR's current head (baseCommitSha). If it is NOT (meaning the agent merged
default-branch commits that are ahead of the PR head), use the merge-base as
the effective diff base. This excludes the merged upstream commits from the
size measurement, leaving only the agent's actual contribution.

The fix preserves existing behavior when the agent does not merge the default
branch (the merge-base will be an ancestor of the PR head, so baseCommitSha
remains the diff base).

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…fSize test

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix patch base to use PR's current head SHA fix: exclude merged upstream commits from diffSize in push_to_pull_request_branch incremental mode May 22, 2026
Copilot AI requested a review from pelikhan May 22, 2026 23:57
@pelikhan pelikhan marked this pull request as ready for review May 23, 2026 00:12
Copilot AI review requested due to automatic review settings May 23, 2026 00:12
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes how diffSize is computed in generateGitPatch(..., { mode: "incremental" }) so that merging the base branch into a stale PR branch (e.g., git merge origin/<baseBranch>) does not cause diffSize to balloon by counting already-upstream commits. This is used for size enforcement (e.g., max_patch_size) without changing the transport patch content.

Changes:

  • Adjust incremental diffSize measurement to optionally use merge-base(origin/<baseBranch>, <localBranch>) as the diff base when upstream commits were merged into the PR branch.
  • Add incremental-mode regression tests covering merge and no-merge scenarios to ensure diffSize remains small while the transport patch may remain large.
Show a summary per file
File Description
actions/setup/js/generate_git_patch.cjs Adds merge-base-based diff base selection for incremental diffSize sizing.
actions/setup/js/generate_git_patch.test.cjs Adds tests ensuring incremental diffSize excludes merged upstream commits and doesn’t regress the no-merge case.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 2/2 changed files
  • Comments generated: 1

Comment thread actions/setup/js/generate_git_patch.cjs Outdated
Comment on lines +480 to +483
if (!mbIsAncestorOfBase) {
debugLog(`Strategy 1 (incremental): agent merged ${defaultBranch} ahead of PR head; using merge-base ${mb} as diff base instead of PR head ${baseCommitSha}`);
diffBaseForSize = mb;
}
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 23, 2026

🧪 Test Quality Sentinel completed test quality analysis.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 23, 2026

⚠️ PR Code Quality Reviewer failed during code quality review.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 23, 2026

Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #34139 does not have the 'implementation' label and has 0 new lines of code in default business logic directories (threshold is 100).

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 23, 2026

🧠 Matt Pocock Skills Reviewer failed during the skills-based review.

@github-actions github-actions Bot mentioned this pull request May 23, 2026
@github-actions
Copy link
Copy Markdown
Contributor

🧪 Test Quality Sentinel Report

⚠️ Test Quality Score: 60/100 — Acceptable with Suggestions

Analyzed 2 test(s): 2 design, 0 implementation, 0 guideline violation(s). Test inflation detected (3.4:1 ratio).

📊 Metrics & Test Classification (2 tests analyzed)
Metric Value
New/modified tests analyzed 2
✅ Design tests (behavioral contracts) 2 (100%)
⚠️ Implementation tests (low value) 0 (0%)
Tests with error/edge cases 0 (0%)
Duplicate test clusters 0
Test inflation detected YES (3.4:1 ratio, threshold 2:1)
🚨 Coding-guideline violations 0

Test Classification Details

Test File Classification Issues Detected
should not inflate diffSize when agent merges base branch into a stale PR branch actions/setup/js/generate_git_patch.test.cjs:683 ✅ Design No error cases; test inflation
should preserve correct diffSize when agent does NOT merge base branch actions/setup/js/generate_git_patch.test.cjs:754 ✅ Design No error cases; test inflation

Language Support

Tests analyzed:

  • 🟨 JavaScript (*.test.cjs, *.test.js): 2 tests (vitest)
⚠️ Flagged Tests — Requires Review (3 issue(s))

⚠️ Test 1: should not inflate diffSize when agent merges base branch into a stale PR branch (actions/setup/js/generate_git_patch.test.cjs:683)

Classification: Design test (high value)

Issues detected:

  1. Test inflation (3.4:1 ratio): 165 test lines vs 48 production lines exceeds 2:1 threshold
  2. Missing error coverage: Only tests success path; production code has try/catch blocks that are not exercised

What design invariant does this test enforce?
When an agent merges the base branch into a stale PR and adds a small change, the incremental diffSize measurement must reflect only the agent's contribution (not the merged upstream commits). This prevents false max_patch_size validation failures.

What would break if deleted?
A regression in diffSize calculation would go undetected, causing legitimate agent changes to be rejected due to inflated size metrics.

Strengths:

  • ✅ Tests observable behavior (diffSize metric) not implementation details
  • ✅ Reproduces real production scenario from issue report
  • ✅ No mocking of internals; exercises real git operations
  • ✅ Includes sanity checks (verifies patchSize > diffSize)

Suggested improvements:

  1. Add error path test: Test the fallback behavior when git merge-base or ref resolution fails. Production code has catch (adjustErr) block that should be exercised.

    it("should fall back to PR head when merge-base calculation fails", async () => {
      // Setup: missing default branch ref or corrupted git state
      // Expected: diffSize uses baseCommitSha fallback, logs debug message
    });
  2. Extract setup helpers: The git repository scaffolding (20+ lines) is repeated in both tests. Extract to setupTestRepo() helper to reduce duplication and line count.

  3. Test boundary conditions: What if agent's change is 0 bytes? What if diffSize equals a threshold exactly?


⚠️ Test 2: should preserve correct diffSize when agent does NOT merge base branch (actions/setup/js/generate_git_patch.test.cjs:754)

Classification: Design test (high value)

Issues detected:

  1. Test inflation (same file, contributes to 3.4:1 ratio)
  2. Missing error coverage: Only tests success path

What design invariant does this test enforce?
The fix for merged-upstream inflation must NOT regress the normal incremental case where no merge occurs. This is a critical non-regression test.

What would break if deleted?
A regression that breaks normal incremental mode (agent adds commits without merging base) would go undetected. The new merge-detection logic might incorrectly interfere with the happy path.

Strengths:

  • ✅ Tests behavioral stability of existing incremental mode
  • ✅ Ensures fix doesn't break normal case
  • ✅ Verifies diffSize accuracy without merge scenario

Suggested improvements:

  1. Share setup code: Identical to test 1's git scaffolding — extract to helper
  2. Add error path variant: Test what happens when git operations fail in the non-merge case

🔶 Test Inflation Issue (applies to both tests)

Ratio: 165 test lines / 48 production lines = 3.4:1 (exceeds 2:1 threshold)

Mitigating factors:

  • Production change is algorithmically dense (merge-base logic, ancestry checks, nested try/catch blocks)
  • Test setup requires real git repository scaffolding (bare remote, branches, commits, merges)
  • Tests reproduce complex real-world scenario (480-commit upstream advance, stale PR, agent merge)

Recommendation: While the ratio is high, the tests provide genuine behavioral coverage of a production bug. Consider extracting shared setup code to reduce duplication rather than removing coverage. The inflation is justified by the complexity of the scenario being tested.

Verdict

Check passed. 0% of new tests are implementation tests (threshold: 30%).

⚠️ Note: Test inflation penalty applied (−10 pts). While both tests verify behavioral contracts, they exceed the 2:1 line ratio guideline. The main gap is missing error coverage — neither test exercises the production code's error handling paths (try/catch blocks).

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

  • Assert on observable outputs, return values, or state changes
  • Cover error paths and boundary conditions
  • Would catch a behavioral regression if deleted
  • Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

  • Assert on internal function calls (mocking internals)
  • Only test the happy path with typical inputs
  • Break during legitimate refactoring even when behavior is correct
  • Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

🧪 Test quality analysis by Test Quality Sentinel · ● 1.3M ·

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Test Quality Sentinel: 60/100. Test quality is acceptable — 0% of new tests are implementation tests (threshold: 30%). Both tests verify behavioral contracts and provide genuine coverage of the diffSize calculation fix.

@pelikhan
Copy link
Copy Markdown
Collaborator

@copilot review all pull review request comments and resolve them

…ry rewrites

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented May 23, 2026

@copilot review all pull review request comments and resolve them

Addressed in commit fix: guard merge-base diff adjustment against rebase/force-push history rewrites.

The merge-base adjustment now only applies when baseCommitSha (= origin/<pr-branch>) is an ancestor of the local branch tip. If it is not an ancestor (i.e., history was rewritten by rebase/force-push), the adjustment is skipped entirely and baseCommitSha is kept as the diff base, preventing the undercounting described in the review. A new test (should skip merge-base adjustment and use original base when history was rewritten (rebase)) covers this case.

@pelikhan
Copy link
Copy Markdown
Collaborator

@copilot add more tests covering corner cases

@github-actions
Copy link
Copy Markdown
Contributor

``
@copilot review all comments

Please address the unresolved review feedback and push the follow-up fix.

Generated by 👨‍🍳 PR Sous Chef · ● 8M ·

@github-actions
Copy link
Copy Markdown
Contributor

``
Please add corner-case tests and push a follow-up update.

Generated by 👨‍🍳 PR Sous Chef · ● 8M ·

@pelikhan pelikhan merged commit 99ffd28 into main May 23, 2026
6 of 7 checks passed
@pelikhan pelikhan deleted the copilot/fix-patch-base-calculation branch May 23, 2026 02:11
Copilot stopped work on behalf of pelikhan due to an error May 23, 2026 02:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

push_to_pull_request_branch: patch base should be the PR's current head SHA, not the target repo's base branch

3 participants