push_to_pull_request_branch: patch base should be the PR's current head SHA, not the target repo's base branch #34108

@bryanchen-d

Summary

When push_to_pull_request_branch targets an existing PR via target: <pr_number>, the safe-outputs job currently computes the patch as agent_working_tree vs the target repo's base branch (e.g. main), not vs the PR's existing head SHA. In any workflow where the agent has touched the PR branch's history (e.g. merged main in to resolve conflicts), this inflates the patch with hundreds of upstream commits that are already on main, so the patch exceeds max_patch_size even though the agent's actual contribution is tiny.

Concrete repro

Frontmatter excerpt:

checkout:
  - repository: microsoft/vscode
    ref: main           # static; agent does `git checkout <head_branch>` inside the prompt
    fetch-depth: 0

safe-outputs:
  push-to-pull-request-branch:
    target: "${{ inputs.pr_number }}"
    target-repo: "microsoft/vscode"
    protected-files: fallback-to-issue
    if-no-changes: warn
    check-branch-protection: false

Agent behavior (per prompt body): git checkout <pr_head_branch>, git merge origin/main (PR was stale), resolve conflicts, commit.

Resulting agent/aw-<branch>.patch artifact:

  • 11.3 MB, 480 commits (verified by counting the From <sha> Mon Sep… headers)
  • Top authors of those 480 commits: vritant24 (60), Rob Lourens (31), Matt Bierner (31), Connor Peet (25) — only 16 of 480 authored by vs-code-engineering[bot]
  • Top file (single chunk): extensions/copilot/package-lock.json — 38,986 lines
  • Remote PR head meanwhile: 4d0ee846 with 2 commits

The upstream commits among those 480 are not on the remote PR branch; they end up in the patch because git format-patch <base>..HEAD runs with <base> set to main.
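For anyone who wants to reproduce the commit and author counts above on their own patch artifact, here is a minimal sketch. It assumes the artifact is plain git format-patch output, where each commit starts with a "From <sha> ..." separator line followed by a "From: Name <email>" header. The function name summarize_patch is hypothetical, and the author regex is deliberately naive (it does not decode RFC 2047 encoded names, and a "From:" line inside a commit body would be miscounted).

```python
import re
import sys
from collections import Counter

# Each commit in format-patch output begins with "From <40-hex sha> ...".
COMMIT_RE = re.compile(r"^From [0-9a-f]{40} ", re.MULTILINE)
# The author header that follows; naive match on "From: Name <email>".
AUTHOR_RE = re.compile(r"^From: (.+?) <[^>]*>$", re.MULTILINE)


def summarize_patch(text):
    """Return (commit_count, Counter of author name -> commits)."""
    commits = len(COMMIT_RE.findall(text))
    authors = Counter(AUTHOR_RE.findall(text))
    return commits, authors


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize_patch.py path/to/file.patch
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        count, by_author = summarize_patch(fh.read())
    print(f"{count} commits")
    for name, n in by_author.most_common(5):
        print(f"{n:5d}  {name}")
```

If the commit count is far above the number of commits the agent made, the patch base is probably not the PR head.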

Failure message:

✗ Message 1 (push_to_pull_request_branch) failed: Incremental diff size (9282 KB) exceeds maximum allowed size (1024 KB). Bundle size: 5486 KB.

In the same batch, three different PRs each produced a patch with an identical 9282 KB "incremental diff size". That is a strong signal that the dominant content is shared upstream history, not per-PR contribution.

Why this happens (from lock-file inspection)

The safe-outputs job's extract-base-branch step resolves base via:

item.base_branch || github.base_ref || github.event.pull_request.base.ref || github.event.repository.default_branch

For workflow_dispatch with no base_branch set on the item, the first three candidates are empty and the chain falls through to the repository default branch, main. The patch is then computed against main. If the agent merged main into a stale PR branch as part of its job, the agent_tip vs main diff includes every upstream commit brought in by the merge, even though those commits are already on origin/main and would not need to be transferred at all in a real git push.
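To make the fall-through concrete, here is a small Python model of that expression. It is an illustration of the chain as I read it from the lock file, not the actual gh-aw code; the function name resolve_base_branch, the dict shapes, and the PR number 123 are assumptions for the example.

```python
def _dig(obj, *keys):
    """Walk nested dicts, returning None as soon as a key is missing."""
    for key in keys:
        if not isinstance(obj, dict):
            return None
        obj = obj.get(key)
    return obj


def resolve_base_branch(item, github):
    """Mirror `a || b || c || d`: the first non-empty candidate wins."""
    candidates = [
        _dig(item, "base_branch"),
        _dig(github, "base_ref"),
        _dig(github, "event", "pull_request", "base", "ref"),
        _dig(github, "event", "repository", "default_branch"),
    ]
    for candidate in candidates:
        if candidate:
            return candidate
    return ""  # nothing resolved


# workflow_dispatch: no pull_request payload, base_ref is empty.
dispatch_context = {
    "base_ref": "",
    "event": {
        "inputs": {"pr_number": "123"},  # hypothetical PR number
        "repository": {"default_branch": "main"},
    },
}
print(resolve_base_branch({}, dispatch_context))  # main
```

Nothing in the chain looks at the PR that target points to, so a dispatch-triggered run always ends up on the default branch unless the item sets base_branch itself.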

Related prior issues

Proposed behavior

When the safe-output item's target resolves to an existing PR on target-repo:

  1. At patch-generation time, resolve the PR's current head.sha via REST (GET /repos/{owner}/{repo}/pulls/{number}).
  2. Use head.sha (or origin/<head_ref>) as the patch base. The patch then represents only commits the agent added on top of the PR's current public tip — which is what push_to_pull_request_branch semantically implies.
  3. Continue to honor item.base_branch / a new item.base_sha as explicit overrides.
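A rough sketch of the three steps above, in Python for illustration only. The REST endpoint and the head.sha / head.ref response fields are the documented GitHub API; everything else (function names, the item dict shape, base_sha taking precedence over base_branch, the returned source labels) is an assumption of this sketch, not existing gh-aw behavior.

```python
import json
import urllib.request


def fetch_pr_head(owner, repo, number, token):
    """Step 1: GET /repos/{owner}/{repo}/pulls/{number} -> (head.sha, head.ref)."""
    url = f"https://api.github.com/repos/{owner}/{repo}/pulls/{number}"
    request = urllib.request.Request(url, headers={
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    })
    with urllib.request.urlopen(request) as response:
        pr = json.load(response)
    return pr["head"]["sha"], pr["head"]["ref"]


def choose_patch_base(item, pr_head_sha):
    """Steps 2 and 3: explicit overrides win, otherwise use the PR head.

    Returns (base, source). Precedence of base_sha over base_branch is an
    assumption; the proposal only says both should be honored.
    """
    for key in ("base_sha", "base_branch"):
        if item.get(key):
            return item[key], f"item.{key}"
    if pr_head_sha:
        return pr_head_sha, "pr-head"
    return None, "unresolved"


# With no overrides, the base is the PR's current public tip.
print(choose_patch_base({}, "4d0ee846"))  # ('4d0ee846', 'pr-head')
```

Keeping the REST lookup separate from the selection logic means the precedence rules can be tested without network access or a token.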

Questions for maintainers

  1. Should PR-head as patch base be the default for push_to_pull_request_branch (semantically "update this PR's branch"), or opt-in via a new config flag like incremental-from: pr-head | base-branch?
  2. Is there an existing knob I've missed that would already achieve this?
  3. Would a per-item base_sha override be acceptable as a near-term workaround?

Related companion issue

Separately filing #TBD about a higher-level need: a sync/rebase primitive so merge-conflict-resolution workflows can split "rebase from main" (mechanical) from "agent-authored changes" (LLM), instead of squashing both through the patch transport.

Environment

  • gh-aw: v0.74.8
  • Engine: copilot, claude-opus-4.6, engine_version 1.0.48
  • Workflow type: workflow_dispatch with dynamic PR target via inputs

Artifacts

Available on request — agent artifact from run 26298243465 (8.4 MB) contains the 11.3 MB format-patch and 5.6 MB git bundle.
