push_to_pull_request_branch: patch base should be the PR's current head SHA, not the target repo's base branch
Summary
When push_to_pull_request_branch targets an existing PR via target: <pr_number>, the safe-outputs job currently computes the patch as agent_working_tree vs the target repo's base branch (e.g. main) — not vs the PR's existing head SHA. For any workflow where the agent has touched the PR branch's history (e.g. merged main in to resolve conflicts), this inflates the patch to include hundreds of upstream commits that are already on main, blowing past max_patch_size even though the agent's actual contribution is tiny.
Concrete repro
Workflow: errors-fix-driver (microsoft/vscode-engineering). Triggered via workflow_dispatch with {pr_number, repo_owner, repo_name, action: "cron_merge_conflict"}.
Top authors of the 480 commits in the generated patch: vritant24 (60), Rob Lourens (31), Matt Bierner (31), Connor Peet (25); only 16 of the 480 were authored by vs-code-engineering[bot]
Top file (single chunk): extensions/copilot/package-lock.json — 38,986 lines
Remote PR head meanwhile: 4d0ee846 with 2 commits
The 480 upstream commits are not on the PR branch. They end up in the patch because it is generated with git format-patch <base>..HEAD, where <base> is main.
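One way to check figures like the commit count and per-author breakdown above is to scan the format-patch output directly. The sketch below is an illustration, not the tooling used for this report: it assumes SHA-1 object names (40 hex characters) and plain, unencoded author names, and the function name summarize_patch is hypothetical.

```python
import re
from collections import Counter

# git format-patch starts every patch with an mbox separator line of the form
# "From <sha> Mon Sep 17 00:00:00 2001" (assumes 40-character SHA-1 names).
SEPARATOR = re.compile(r"^From [0-9a-f]{40} Mon Sep 17 00:00:00 2001$")
# The author is in the "From:" mail header that follows the separator.
AUTHOR = re.compile(r"^From: (.+?) <[^>]*>$")


def summarize_patch(text):
    """Return (commit_count, Counter of commits per author name)."""
    commits = 0
    authors = Counter()
    expect_author = False
    for line in text.splitlines():
        if SEPARATOR.match(line):
            commits += 1
            expect_author = True
        elif expect_author:
            match = AUTHOR.match(line)
            if match:
                # Only the first "From:" after a separator is counted, so
                # similar-looking lines inside a diff body are ignored.
                authors[match.group(1)] += 1
                expect_author = False
    return commits, authors
```

Calling summarize_patch(open(path, encoding="utf-8", errors="replace").read()) on the patch artifact and then authors.most_common(4) would give a top-authors list in the same shape as the one quoted above.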
Failure message:
✗ Message 1 (push_to_pull_request_branch) failed: Incremental diff size (9282 KB) exceeds maximum allowed size (1024 KB). Bundle size: 5486 KB.
In the same batch, three different PRs each produced a patch with identical 9282 KB "incremental diff size" — strong signal that the dominant content is shared upstream history, not per-PR contribution.
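To make the comparison across PRs concrete, here is a minimal sketch that parses the size line from the failure message and groups PRs that report the same size. The regular expression is written against the message text quoted above; the function names and the PR labels in the usage note are hypothetical.

```python
import re

# Matches the size line as it appears in the failure message quoted above.
SIZE_LINE = re.compile(
    r"Incremental diff size \((\d+) KB\) exceeds maximum allowed size \((\d+) KB\)"
)


def parse_size_failure(message):
    """Return (actual_kb, limit_kb), or None if the message has another shape."""
    match = SIZE_LINE.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def shared_sizes(size_by_pr):
    """Group PRs by reported size and keep sizes reported by more than one PR."""
    groups = {}
    for pr, size in size_by_pr.items():
        groups.setdefault(size, []).append(pr)
    return {size: prs for size, prs in groups.items() if len(prs) > 1}
```

For the message above, parse_size_failure returns (9282, 1024), roughly nine times the limit. Feeding three placeholder PRs that all report 9282 into shared_sizes puts them in one group, which is the pattern that points at shared upstream history.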
Why this happens (from lock-file inspection)
The safe-outputs job's extract-base-branch step resolves base via:
For workflow_dispatch with no base_branch set on the item, all four fall through to main. The patch is then computed against main. If the agent merged main into a stale PR branch as part of its job, the agent_tip vs main diff includes every upstream commit brought in by the merge — even though those commits are already on origin/main and would not need to be transferred at all in a real git push.
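The fall-through described here can be pictured as a "first non-empty value wins" chain. This is only a sketch of that shape: the four sources the real step consults are not reproduced in this text, so the candidates are passed in as an anonymous list, and resolve_base_branch is a hypothetical name.

```python
DEFAULT_BRANCH = "main"  # assumption: the target repo's default branch


def resolve_base_branch(candidates, default=DEFAULT_BRANCH):
    """Return the first non-empty candidate, or the default if none is set."""
    for value in candidates:
        if value:
            return value
    return default


# With four unset candidates, as in the workflow_dispatch case described
# above, the chain falls through to the default branch.
unset = [None, "", None, ""]
base = resolve_base_branch(unset)
```

Under these assumptions base ends up as "main", so the patch is computed against main even though the item targets an existing PR branch.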
Proposed behavior
When the safe-output item's target resolves to an existing PR on target-repo:
At patch-generation time, resolve the PR's current head.sha via REST (GET /repos/{owner}/{repo}/pulls/{number}).
Use head.sha (or origin/<head_ref>) as the patch base. The patch then represents only commits the agent added on top of the PR's current public tip — which is what push_to_pull_request_branch semantically implies.
Continue to honor item.base_branch / a new item.base_sha as explicit overrides.
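The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the handler's actual code: the function names are hypothetical, the item is modeled as a plain dict, and ranking base_sha above base_branch is an assumption about how the two overrides would be ordered. The endpoint and the head.sha field are those of the GitHub REST "get a pull request" response.

```python
import json
import urllib.request


def pull_request_url(owner, repo, number):
    """URL for GET /repos/{owner}/{repo}/pulls/{number}."""
    return f"https://api.github.com/repos/{owner}/{repo}/pulls/{number}"


def fetch_pull_request(owner, repo, number, token):
    """Fetch the PR object over REST (performs a network call)."""
    request = urllib.request.Request(
        pull_request_url(owner, repo, number),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def choose_patch_base(item, pr=None, default_branch="main"):
    """Pick the patch base: explicit overrides first, then the PR head."""
    if item.get("base_sha"):       # proposed new override (assumed highest)
        return item["base_sha"]
    if item.get("base_branch"):    # existing override
        return item["base_branch"]
    if pr is not None:             # target resolved to an existing PR
        return pr["head"]["sha"]
    return default_branch          # no PR target: fall back as today
```

Keeping the selection logic separate from the network call means the precedence can be tested with a stub PR object, without calling the API.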
Questions for maintainers
Should PR-head as patch base be the default for push_to_pull_request_branch (semantically "update this PR's branch"), or opt-in via a new config flag like incremental-from: pr-head | base-branch?
Is there an existing knob I've missed that would already achieve this?
Would a per-item base_sha override be acceptable as a near-term workaround?
Related companion issue
Separately filing #TBD about a higher-level need: a sync/rebase primitive so merge-conflict-resolution workflows can split "rebase from main" (mechanical) from "agent-authored changes" (LLM), instead of squashing both through the patch transport.
Agent behavior in the repro above (per prompt body):
git checkout <pr_head_branch>, git merge origin/main (PR was stale), resolve conflicts, commit.
Resulting agent/aw-<branch>.patch artifact in the repro:
480 commits (From <sha> Mon Sep… headers, verified by counting)
Related prior issues
extract-base-branch derivation problem, same handler family
Environment
workflow_dispatch with dynamic PR target via inputs
Artifacts
Available on request: the agent artifact from run 26298243465 (8.4 MB) contains the 11.3 MB format-patch and the 5.6 MB git bundle.