create_pull_request: git am fallback also fails and cancels unrelated safe outputs — needs stronger retry

## Problem

A `create_pull_request` safe-output failed because `git am` could not apply the generated patch, and the built-in fallback (re-checking out the original base commit and re-running `git am`) **also failed**. The job then exited with a hard error and the second safe-output message (`notify_source_pr`) was cancelled — even though it was a completely independent operation.

The agent's token and context spend had already been paid by the time this failure happened in the post-processing step, so a transient failure here is especially expensive.

## Failing run

https://github.com/microsoft/aspire/actions/runs/26078776570/job/76676900386

The agent ran in `microsoft/aspire` and targeted a PR in `microsoft/aspire.dev` (cross-repo `create-pull-request` against `release/13.4`).

## Observed behavior

The fallback path that's supposed to "save" a failed `git am` by recreating the branch at the patch's original base commit *also* failed:

```
Attempting fallback: create PR branch at original base commit...
Original base commit from patch generation: 1cb508fd49bedc4afeb0b8fc008d51689756d853
...
Switched to a new branch 'docs/pr-17234-eager-config-migration-33637d63a4a37bea'
Created branch ... at original base commit 1cb508fd49bedc4afeb0b8fc008d51689756d853
/usr/bin/git am /tmp/gh-aw/aw-microsoft-aspire.dev-docs-pr-17234-eager-config-migration.patch
error: patch failed: src/frontend/config/sidebar/docs.topics.ts:867
error: src/frontend/config/sidebar/docs.topics.ts: patch does not apply
error: patch failed: src/frontend/scripts/check-data-files.mjs:21
error: src/frontend/scripts/check-data-files.mjs: patch does not apply
error: src/frontend/src/content/docs/app-host/hot-reload-and-watch.mdx: already exists in index
error: patch failed: src/frontend/src/content/docs/dashboard/index.mdx:17
error: src/frontend/src/content/docs/dashboard/index.mdx: patch does not apply
... (many more files) ...
error: patch failed: src/frontend/src/data/aspire-integrations.json:2
error: src/frontend/src/data/aspire-integrations.json: patch does not apply
error: patch failed: src/frontend/src/data/github-stats.json:1
error: src/frontend/src/data/github-stats.json: patch does not apply
error: patch failed: src/frontend/tests/e2e/ui-regressions.spec.ts:266
error: src/frontend/tests/e2e/ui-regressions.spec.ts: patch does not apply
Applying: Merge main into release/13.4 (#893)
Patch failed at 0001 Merge main into release/13.4 (#893)
...
Warning: Fallback to original base commit failed: The process '/usr/bin/git' failed with exit code 128
Error: ✗ Message 1 (create_pull_request) failed: Failed to apply patch
Warning: ⚠️ Code push operation 'create_pull_request' failed — remaining safe outputs will be cancelled
⏭ Message 2 (notify_source_pr) cancelled — Cancelled: code push operation failed (create_pull_request: Failed to apply patch)
```

Two distinct problems are visible:

1. **Patch contains a mix of "modify existing file" hunks and one "create new file" hunk** (`hot-reload-and-watch.mdx: already exists in index`) — even at the original base commit `1cb508fd…`. This smells like the cross-repo / different-tree patch-generation issue from #17969, but it surfaces here as the *fallback* also failing rather than the initial `git am`.
2. **Unrelated safe outputs are cancelled** because one code-push message failed. `notify_source_pr` has no dependency on `create_pull_request` succeeding, but it never runs.
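
To tell the two failure shapes in problem 1 apart mechanically, the runner could group the `error:` lines from `git am` by kind. The following is a minimal sketch, not the runner's actual code: `classify_apply_errors` and the result keys are hypothetical names, and the patterns are taken only from the log lines quoted above.

```python
import re

# Patterns for the two error shapes seen in the log above.
# The "error: patch failed: <path>:<line>" lines are intentionally ignored,
# because each one is followed by a "patch does not apply" line for the same path.
_ALREADY_EXISTS = re.compile(r"^error: (?P<path>.+): already exists in index$")
_DOES_NOT_APPLY = re.compile(r"^error: (?P<path>.+): patch does not apply$")


def classify_apply_errors(stderr: str) -> dict:
    """Group failing paths by failure kind (hypothetical helper)."""
    result = {"does_not_apply": [], "already_exists": []}
    for raw in stderr.splitlines():
        line = raw.strip()
        match = _ALREADY_EXISTS.match(line)
        if match:
            result["already_exists"].append(match.group("path"))
            continue
        match = _DOES_NOT_APPLY.match(line)
        if match:
            result["does_not_apply"].append(match.group("path"))
    return result
```

A non-empty `already_exists` list would be the signal to try the "create new file" repair described under Expected behavior, rather than treating the whole patch as unrecoverable.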

## Expected behavior

When a `git am` patch application fails after the agent has already finished its work, the safe-outputs runner should be more resilient:

1. **Stronger retry / repair before giving up.** Cheap recovery attempts that don't require re-prompting the model should be exhausted first, e.g.:
   - `git am --3way` (in addition to plain `git am`).
   - `git apply --3way` or `git apply --reject` (git rejects the two flags in combination), followed by `git am --continue` after staging the clean hunks.
   - For files where the only failure is `already exists in index`, fall back to applying the hunks as a modify against the file already present in the tree (re-derive a "modify" diff from the patch's `+` body, using the existing file as base).
   - For cross-repo targets, regenerate the patch against the **target** repo's tree (the long-standing root cause from #17969) — even as a one-shot fix-up after `git am` fails.
2. **Don't cancel independent safe outputs.** A failure in `create_pull_request` should not implicitly cancel `notify_source_pr` (or any other message that doesn't depend on the PR being created). Either declare dependencies explicitly, or have non-code-push messages keep running.
3. **Optionally, push the partial result as a draft PR / artifact** so a human can finish the apply with a clear diff in front of them, instead of throwing the whole agent run away.
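
The explicit-dependency option in point 2 could look roughly like the sketch below. It is an illustration under assumptions: `Message`, `depends_on`, and `process_messages` are hypothetical names, and the real safe-outputs runner may model messages quite differently.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Message:
    name: str
    run: Callable[[], None]
    # Hypothetical field: names of messages that must succeed before this one runs.
    depends_on: List[str] = field(default_factory=list)


def process_messages(messages: List[Message]) -> Dict[str, str]:
    """Run messages in order, cancelling only those whose dependencies did not succeed."""
    status: Dict[str, str] = {}
    for msg in messages:
        blocked = [dep for dep in msg.depends_on if status.get(dep) != "ok"]
        if blocked:
            status[msg.name] = "cancelled"
            continue
        try:
            msg.run()
            status[msg.name] = "ok"
        except Exception:
            # A failure is recorded but does not stop independent messages.
            status[msg.name] = "failed"
    return status
```

With this shape, `notify_source_pr` would still run after a failed `create_pull_request` unless it lists that message in `depends_on`.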

## Why this matters

The agent already spent its context window and tokens producing this output. A `git am` mismatch in the post-processing step is *very* cheap to recover from: retrying with `--3way` or regenerating the patch against the target tree needs no model calls. It's the worst possible place to bail out with no retry, because re-running the workflow means re-spending all those tokens.

Tool-like / mechanical steps that run *after* the model has finished should have aggressive retry logic precisely because the expensive part of the run has already been paid for.
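
As a rough illustration of the cheapest rung of that retry logic, the sketch below tries plain `git am`, then `git am --3way`, aborting the in-progress session between attempts. `apply_with_escalation`, `run_git`, and the injectable `run` parameter are hypothetical names for this example only. It does not cover the cross-repo patch regeneration or the `already exists in index` repair.

```python
import subprocess
from typing import Callable, Optional, Sequence

Runner = Callable[[Sequence[str]], int]


def run_git(args: Sequence[str]) -> int:
    """Run a git command in the current directory and return its exit code."""
    return subprocess.run(["git", *args], check=False).returncode


def apply_with_escalation(patch_path: str, run: Runner = run_git) -> Optional[str]:
    """Try progressively more forgiving strategies; return the name of the one that worked."""
    strategies = [
        ("am", ["am", patch_path]),
        ("am-3way", ["am", "--3way", patch_path]),
    ]
    for name, args in strategies:
        if run(args) == 0:
            return name
        # A failed `git am` leaves a session in progress; clear it before the next attempt.
        run(["am", "--abort"])
    return None
```

Passing the runner in as a parameter keeps the escalation order testable without a real repository. A `None` result would be the point at which the runner falls through to heavier repairs, or to a draft PR or artifact.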

## Possibly related

- #17969 — Cross-repo safe-output PRs fail: patch generated as "create new file" when target file already exists. The `already exists in index` line above looks like the same underlying patch-generation bug, but here it manifests in the fallback path too.

## Environment

- Repo: `microsoft/aspire` (source) → `microsoft/aspire.dev` (target)
- Base branch: `release/13.4`
- Failing job: https://github.com/microsoft/aspire/actions/runs/26078776570/job/76676900386

