Skip to content

fix(checkup): don't fail setup when a stale checkup/issue-* branch already exists (#1338)#1339

Merged
gltanaka merged 14 commits into
mainfrom
fix/issue-1338-stale-checkup-branch
Jun 2, 2026
Merged

fix(checkup): don't fail setup when a stale checkup/issue-* branch already exists (#1338)#1339
gltanaka merged 14 commits into
mainfrom
fix/issue-1338-stale-checkup-branch

Conversation

@sohni-tagirisa
Copy link
Copy Markdown
Collaborator

Summary

Fixes #1338. pdd checkup aborted during worktree setup when a stale checkup/issue-N branch already existed. _setup_worktree (pdd/agentic_checkup_orchestrator.py) deleted the branch but ignored the result of _delete_branch and set has_branch = False unconditionally — so when the delete failed, git worktree add -b failed with "a branch named ... already exists" and the whole checkup died.

Root cause

A crashed or out-of-band-cleaned-up prior run can leave checkup/issue-N registered to a worktree directory that no longer exists. git branch -D then refuses:

error: Cannot delete branch 'checkup/issue-7' checked out at '<gone path>'

That failure was swallowed, and the subsequent git worktree add -b:

fatal: a branch named 'checkup/issue-7' already exists
→ _setup_worktree returns (None, "Failed to create worktree: ...")

Fix

  • Add _prune_worktrees and call git worktree prune before handling the branch, so a branch registered to a removed worktree dir can be deleted/recreated.
  • Honor _delete_branch's return value; only clear has_branch on success.
  • If the branch is still undeletable after pruning (genuinely checked out in a live worktree), return a clear, actionable error naming the conflicting worktree instead of the cryptic Failed to create worktree.

How I tested it

I reproduced the failure against the real _setup_worktree with a temp git repo (branch registered to a worktree dir then rmtree'd) — pre-fix it returned (None, "Failed to create worktree: ... a branch named 'checkup/issue-7' already exists"). After the fix:

scenario before after
stale branch (worktree dir removed) ❌ Failed to create worktree ✅ branch reset, fresh worktree created
plain leftover branch (not checked out) ✅ (unchanged)
branch checked out in a live worktree ❌ cryptic "Failed to create worktree" ✅ clear error: which worktree to remove

Pinned by TestSetupWorktreeStaleBranch (3 real-git tests). Full tests/test_agentic_checkup_orchestrator.py + tests/test_checkup_pr_mode.py324 passed; git diff --check origin/main clean.

Note on pdd fix: this sandbox has no LLM API key, so the LLM-driven pdd fix autonomous flow can't run here. I diagnosed the failure by reproducing it against the real _setup_worktree, fixed the code directly, and added a regression test — the same end state pdd fix would have produced.

🤖 Generated with Claude Code

@sohni-tagirisa sohni-tagirisa self-assigned this Jun 2, 2026
@Serhan-Asad
Copy link
Copy Markdown
Collaborator

/heal

@Serhan-Asad
Copy link
Copy Markdown
Collaborator

The failing required check is auto-heal, not the unit tests or CodeQL.

The /heal run reached Cloud Build 9f37c7e2-e0c1-4553-893b-2ff1359bf524, and the failure is:

metadata staging verification failed: missing .pdd/meta/agentic_checkup_orchestrator_python.json

pdd-cli generated metadata for agentic_checkup_orchestrator, but .gitignore ignores .pdd/meta/*.json unless each file is explicitly whitelisted, so the generated metadata file was skipped during staging. To unblock this PR, add/whitelist .pdd/meta/agentic_checkup_orchestrator_python.json or adjust the auto-heal metadata staging rule.

@Serhan-Asad
Copy link
Copy Markdown
Collaborator

/heal

1 similar comment
@Serhan-Asad
Copy link
Copy Markdown
Collaborator

/heal

@RyanTanuki
Copy link
Copy Markdown
Collaborator

/gcbrun

@RyanTanuki
Copy link
Copy Markdown
Collaborator

/heal

6 similar comments
@RyanTanuki
Copy link
Copy Markdown
Collaborator

/heal

@Serhan-Asad
Copy link
Copy Markdown
Collaborator

/heal

@Serhan-Asad
Copy link
Copy Markdown
Collaborator

/heal

@Serhan-Asad
Copy link
Copy Markdown
Collaborator

/heal

@Serhan-Asad
Copy link
Copy Markdown
Collaborator

/heal

@Serhan-Asad
Copy link
Copy Markdown
Collaborator

/heal

Copy link
Copy Markdown
Contributor

@gltanaka gltanaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not merge as-is.

The stale removed-worktree failure is real on main, so some fix is needed. I reproduced the base failure with _setup_worktree: a removed registered worktree leaves checkup/issue-* undeletable, then git worktree add -b fails with fatal: a branch named ... already exists.

The blocker is the PR-mode fallback. After _delete_branch fails in _setup_pr_worktree, the code keeps has_branch=True and later intends to recover with git worktree add --force, but before that it still runs:

git fetch <remote> pull/<pr>/head:<branch_name> --force

If <branch_name> is still checked out in a live worktree, Git refuses to update it:

fatal: refusing to fetch into branch 'refs/heads/checkup/pr-...' checked out at '...'

I reproduced this against this PR's actual _setup_pr_worktree; it returns Failed to fetch PR #77... before it ever reaches the forced worktree-add fallback. The test at tests/test_checkup_pr_mode.py mocks subprocess.run and returns success for the fetch path, so it cannot catch this real Git behavior.

Required changes:

  1. Fix the PR-mode branch-deletion-failure path. After pruning, if the PR branch is still checked out in another live worktree, do not fetch into that checked-out branch. Either return a clear actionable error for the live-worktree case, or fetch to an independent temp ref / unique branch / detached worktree and create the checkup worktree from that safe ref.
  2. Add a real-git regression for _setup_pr_worktree that creates a local refs/pull/<n>/head, leaves the scoped checkup/pr-* branch checked out in a live worktree, and exercises the fetch step. The current mock-only test is insufficient.
  3. Reconcile this PR with the duplicate stale-checkup-branch work in #1281/#1299. If #1339 is the PR to keep, keep the diff focused and remove unrelated generated example/architecture churn unless it is required by auto-heal.

Verdict: the underlying bug needs a fix, but #1339 should not merge in its current form.

@Serhan-Asad
Copy link
Copy Markdown
Collaborator

/heal

@sohni-tagirisa
Copy link
Copy Markdown
Collaborator Author

Tuesday, Jun 2nd

@Serhan-Asad
Copy link
Copy Markdown
Collaborator

/heal

Copy link
Copy Markdown
Contributor

@gltanaka gltanaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still do not merge as-is.

I re-tested the updated head (6fb36f40183f0437539fd7201f5947d572f500e1). The PR-mode blocker from the previous review is still present in pdd/agentic_checkup_orchestrator.py: after _delete_branch fails, _setup_pr_worktree still runs the fetch into the checked-out local branch before the forced worktree-add fallback can execute:

git fetch <remote> pull/<pr>/head:<branch_name> --force

With a real local repo and a real local refs/pull/77/head, leaving the scoped checkup/pr-* branch checked out in another live worktree still returns:

Failed to fetch PR #77 ... fatal: refusing to fetch into branch 'refs/heads/checkup/pr-77-...' checked out at '.../live-wt'

So the intended git worktree add --force recovery path is unreachable for the live-worktree PR-mode case.

The PR's new tests still pass, but they are mocked too broadly around subprocess.run; they return success for the fetch command and therefore do not exercise this Git behavior.

Required changes are unchanged:

  1. In _setup_pr_worktree, if pruning does not let the PR branch be deleted, do not fetch into that checked-out branch. Either return a clear actionable live-worktree conflict error, or fetch to a safe independent ref/unique branch and create the checkup worktree from that.
  2. Add a real-git regression that creates a local refs/pull/<n>/head, leaves the scoped checkup/pr-* branch checked out in another live worktree, and verifies the fetch/setup behavior.
  3. Keep or justify the generated prompt/example/architecture churn separately from the stale-branch fix.

Other status: the normal unit/regression/CodeQL checks and auto-heal are green, but there is also a failed heal workflow check caused by authorization (sohni-tagirisa is not a pdd_cloud collaborator). Regardless, the code blocker above is enough to hold the merge.

sohni-tagirisa added a commit that referenced this pull request Jun 2, 2026
…ktree (#1338)

Addresses review on #1339. In PR mode, when `git worktree prune` cannot
free a `checkup/pr-*` branch because it is genuinely checked out in another
live worktree, `_delete_branch` fails and the code used to fall through to
`git fetch pull/<n>/head:<branch> --force`. Git refuses to fetch into a
checked-out branch ("refusing to fetch into branch ... checked out at ..."),
so the forced worktree-add recovery was unreachable and the user saw a
cryptic "Failed to fetch PR #N".

Now `_setup_pr_worktree` detects this case after pruning and returns a clear,
actionable conflict error naming the live worktree to remove — before the
fetch. The stale removed-worktree case still proceeds: prune frees the branch,
it is deleted and re-fetched.

Adds a real-git regression (`TestSetupPrWorktreeStaleBranchRealGit`) that
drives a real `refs/pull/<n>/head` and exercises the fetch step for all three
cases (stale-removed-worktree, plain leftover branch, live-worktree conflict),
replacing the mock-only coverage the review flagged as insufficient. New
helper `_worktree_holding_branch` parses `git worktree list --porcelain`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Serhan-Asad
Copy link
Copy Markdown
Collaborator

/heal

Copy link
Copy Markdown
Contributor

@gltanaka gltanaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not ready to merge.

The latest commit fixes the fresh PR setup path I previously flagged: with resume_existing=False, a live checkup/pr-* worktree now returns a clear actionable conflict before git fetch, and the new real-git tests cover that path.

However, the same fetch-into-live-branch failure still exists in the PR-mode resume path. _run_agentic_checkup_orchestrator_inner calls:

_setup_pr_worktree(cwd, pr_owner, pr_repo, pr_number, quiet, resume_existing=True)

when a saved PR worktree path is missing. In that mode _setup_pr_worktree skips the delete/conflict branch, then still runs:

git fetch <remote> pull/<pr>/head:<branch_name> --force

If the scoped checkup/pr-* branch is checked out in another live worktree, Git still refuses the fetch. I reproduced this against the updated head 9dc0a7c1f7683aba39808d8190d0cf351d5d57a6; it still returns the old cryptic form:

Failed to fetch PR #77 ... fatal: refusing to fetch into branch 'refs/heads/checkup/pr-77-...' checked out at '.../live-wt'

Required change:

  • Apply the same live-worktree conflict guard to the resume_existing=True path before fetching, or otherwise avoid fetching into a checked-out branch in that mode too. Add a real-git regression for resume_existing=True with a local refs/pull/<n>/head and the scoped branch checked out in another live worktree.

Verification I ran locally:

  • New targeted tests on this PR: TestSetupWorktreeStaleBranch, TestSetupPrWorktreeStaleBranch, and TestSetupPrWorktreeStaleBranchRealGit all pass (8 passed).
  • Fresh PR setup live-worktree repro now returns the expected clear conflict.
  • PR resume setup live-worktree repro still fails with the old fetch error.

There is also still a failed heal workflow check due to authorization (sohni-tagirisa is not a pdd_cloud collaborator), while the normal unit/regression/CodeQL checks and auto-heal are green.

@sohni-tagirisa
Copy link
Copy Markdown
Collaborator Author

Addressed the remaining change request around the PR resume path.

What changed:

  • _setup_pr_worktree(..., resume_existing=True) now checks whether the scoped PR checkup branch is still checked out in another live worktree before fetching into it.
  • If the branch is live elsewhere, setup returns an actionable error that points to the holding worktree and suggests git worktree remove <path>.
  • Added a real-git regression test covering resume_existing=True with a live worktree holding the target branch, asserting that we fail before the fetch and do not surface the cryptic refusing to fetch into branch ... checked out error.

Verification run locally:

  • python -m py_compile pdd/agentic_checkup_orchestrator.py
  • python -m pytest tests/test_checkup_pr_mode.py::TestSetupPrWorktreeStaleBranch tests/test_checkup_pr_mode.py::TestSetupPrWorktreeStaleBranchRealGit -q -> 6 passed
  • python -m pytest tests/test_agentic_checkup_orchestrator.py::TestSetupWorktreeStaleBranch -q -> 3 passed
  • python -m pytest tests/test_checkup_pr_mode.py tests/test_agentic_checkup_orchestrator.py -q -> 330 passed
  • git diff --check origin/main...HEAD

Current PR head: ae096e851a4b59076c08dc1eab5d1f67eb150b92.

Current remote checks: CodeQL analysis checks pass. heal / auto-heal are still failing because the actor is not authorized as a pdd_cloud collaborator, so the auto-heal job is blocked before running the build/test workflow. A pdd_cloud collaborator will need to authorize or trigger /heal for that external gate.

sohni-tagirisa and others added 7 commits June 2, 2026 15:49
…anch doesn't fail setup (#1338)

`_setup_worktree` deleted an existing `checkup/issue-N` branch but ignored the
result of `_delete_branch` and set `has_branch = False` unconditionally. When a
crashed or out-of-band-cleaned-up prior run left the branch registered to a
worktree directory that no longer exists, `git branch -D` fails ("Cannot delete
branch ... checked out at <gone path>"); the swallowed failure then let
`git worktree add -b` fail with "a branch named ... already exists", aborting
the whole checkup.

Fix:
- `git worktree prune` (new `_prune_worktrees`) before handling the branch, so a
  branch registered to a removed worktree dir can be deleted/recreated.
- Honor `_delete_branch`'s return value; only clear `has_branch` on success.
- If the branch is still undeletable after pruning (genuinely checked out in a
  LIVE worktree), return a clear, actionable error naming the conflict instead
  of the cryptic "Failed to create worktree".

Regression test (TestSetupWorktreeStaleBranch, real git repos): stale branch
from a removed worktree → recreated; plain leftover branch → recreated; branch
in a live worktree → clear actionable error (not "Failed to create worktree").

Closes #1338

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…deleted

Adopt the "deletion failure does not hard fail" contract (matches PR #1299's
test_stale_branch_deletion_failure_does_not_hard_fail). Previously, after
pruning, a still-undeletable checkup/issue-N branch (still checked out by
another worktree) made _setup_worktree return an error. Instead, reuse the
existing branch via `git worktree add --force` so the checkup proceeds rather
than aborting:

- step 2: on _delete_branch failure, keep the branch and log a notice instead
  of returning an error (track deleted_branch=False).
- step 3: when the branch survived deletion, `git worktree add --force <path>
  <branch>` to reuse it (a stale/duplicate worktree registration no longer
  blocks setup), instead of `-b` which would raise "a branch named ... already
  exists".

`git worktree prune` still runs first, so the common stale case deletes cleanly
and gets a fresh `-b` branch; the forced reuse only handles the residual
"branch still checked out" case.

Updated the real-git regression test: the live-worktree case now recovers via
forced reuse (worktree created, HEAD on the reused branch) rather than
returning a clear error. Verified PR #1299's mocked
test_stale_branch_deletion_failure_does_not_hard_fail now passes against this
implementation. Full test_agentic_checkup_orchestrator.py + test_checkup_pr_mode.py:
324 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
prompt-driven-github-staging Bot and others added 5 commits June 2, 2026 15:49
…ktree (#1338)

Addresses review on #1339. In PR mode, when `git worktree prune` cannot
free a `checkup/pr-*` branch because it is genuinely checked out in another
live worktree, `_delete_branch` fails and the code used to fall through to
`git fetch pull/<n>/head:<branch> --force`. Git refuses to fetch into a
checked-out branch ("refusing to fetch into branch ... checked out at ..."),
so the forced worktree-add recovery was unreachable and the user saw a
cryptic "Failed to fetch PR #N".

Now `_setup_pr_worktree` detects this case after pruning and returns a clear,
actionable conflict error naming the live worktree to remove — before the
fetch. The stale removed-worktree case still proceeds: prune frees the branch,
it is deleted and re-fetched.

Adds a real-git regression (`TestSetupPrWorktreeStaleBranchRealGit`) that
drives a real `refs/pull/<n>/head` and exercises the fetch step for all three
cases (stale-removed-worktree, plain leftover branch, live-worktree conflict),
replacing the mock-only coverage the review flagged as insufficient. New
helper `_worktree_holding_branch` parses `git worktree list --porcelain`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@sohni-tagirisa sohni-tagirisa force-pushed the fix/issue-1338-stale-checkup-branch branch from ae096e8 to 1c21b4a Compare June 2, 2026 22:52
@Serhan-Asad
Copy link
Copy Markdown
Collaborator

/heal

@Serhan-Asad
Copy link
Copy Markdown
Collaborator

/heal

@Serhan-Asad
Copy link
Copy Markdown
Collaborator

/heal

Copy link
Copy Markdown
Contributor

@gltanaka gltanaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previously blocking live-worktree fetch paths are fixed.

I verified the updated head (827e44ed23aa38677c1ae000e058bda6c6e1b314) against both repros:

  • Fresh PR setup (resume_existing=False) with the scoped checkup/pr-* branch checked out in another live worktree now returns a clear actionable conflict before fetching.
  • PR resume setup (resume_existing=True) now does the same; it no longer falls through to Failed to fetch PR ... refusing to fetch into branch ....

Targeted regression tests also pass locally:

TestSetupWorktreeStaleBranch
TestSetupPrWorktreeStaleBranch
TestSetupPrWorktreeStaleBranchRealGit
9 passed

All current reported checks are green, including unit tests, public CLI regression, CodeQL, and auto-heal. Approved.

@gltanaka gltanaka merged commit f7b7ca5 into main Jun 2, 2026
8 checks passed
@gltanaka gltanaka deleted the fix/issue-1338-stale-checkup-branch branch June 2, 2026 23:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Checkup fails when a stale checkup/issue-* branch already exists

4 participants